Friday, March 27, 2015

Talend: Unable to write Chinese characters into text files

When you load Chinese or Japanese characters from an Excel input into a text file output using Talend, you may find that those characters are replaced with '???'. To fix this, select 'UTF-8' in the Encoding option available in the Advanced Settings of the text file output.

I tried this in Talend Big Data Integration and it worked well for me.
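
The same symptom can be reproduced outside Talend. The snippet below is only a sketch in Python, not what Talend runs internally: it assumes the output was being written in a single-byte charset (ISO-8859-1 is used here purely as an example) and shows that such a charset turns each Chinese or Japanese character into '?', while UTF-8 keeps the text intact.

```python
# Sample text: two Chinese characters, a space, three Japanese characters.
text = "中文 日本語"

# A single-byte charset cannot represent these characters. With
# errors="replace", Python substitutes "?" for each one, which is the
# same symptom as the '???' seen in the output file.
lossy = text.encode("iso-8859-1", errors="replace")
print(lossy)  # b'?? ???'

# UTF-8 can represent every character, so the round trip is lossless.
utf8 = text.encode("utf-8")
print(utf8.decode("utf-8") == text)  # True
```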

Thursday, March 26, 2015

AWS Redshift: String contains invalid or unsupported UTF8 codepoints. Bad UTF8 hex sequence: b6

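If AWS Redshift returns the error message below while running a COPY command, add 'ACCEPTINVCHARS ESCAPE' to the COPY command. ACCEPTINVCHARS replaces each invalid UTF-8 character with a replacement character ('?' by default) instead of failing the load; ESCAPE is a separate COPY option that treats a backslash in the input as an escape character.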

Error message: 'String contains invalid or unsupported UTF8 codepoints. Bad UTF8 hex sequence: b6 (error 3)'

COPY <tablename> FROM 's3://bucket/folder/file.txt'
CREDENTIALS 'aws_access_key_id=**********;aws_secret_access_key=******'
DELIMITER '|' ACCEPTINVCHARS ESCAPE IGNOREHEADER 1;
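
If you would rather find the offending bytes before loading than have them replaced, a small pre-check can help. The following is a minimal Python sketch, not part of Redshift or the AWS tooling; the function name find_invalid_utf8 and the sample data are made up for illustration. It reports the first invalid byte on each line, which is enough to locate a stray byte such as b6.

```python
def find_invalid_utf8(data: bytes):
    """Return (line_number, byte_offset, hex_byte) for the first bad byte on each line."""
    problems = []
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the offset of the first undecodable byte in this line.
            bad_byte = format(line[err.start], "02x")
            problems.append((line_number, err.start, bad_byte))
    return problems


# Hypothetical pipe-delimited sample with a lone 0xb6 byte on line 2.
sample = b"id|name\n1|caf\xb6\n2|ok\n"
print(find_invalid_utf8(sample))  # [(2, 5, 'b6')]
```

For a real file, read it in binary mode (open(path, "rb")) and pass the bytes to the function.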

General Copy Command to load S3 file data to Redshift

Use the following general COPY command to load data from an S3 file into a Redshift table:
COPY <tablename> FROM 's3://bucket/folder/file.txt'
CREDENTIALS 'aws_access_key_id=**********;aws_secret_access_key=*********'
DELIMITER '|' IGNOREHEADER 1;
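
When the same load is run for many tables, it can be convenient to generate the statement from a script. The helper below is a hypothetical Python sketch (build_copy_sql and its parameters are my own names, not an AWS API) that only assembles the COPY text shown above; it does not connect to Redshift. The key and secret are placeholders.

```python
def build_copy_sql(table, s3_path, credentials, delimiter="|",
                   ignore_header=1, accept_inv_chars=False):
    """Assemble a Redshift COPY statement as text (trusted inputs only)."""
    options = [f"DELIMITER '{delimiter}'"]
    if accept_inv_chars:
        # Replace invalid UTF-8 characters instead of failing the load.
        options.append("ACCEPTINVCHARS")
    options.append(f"IGNOREHEADER {ignore_header}")
    lines = [
        f"COPY {table} FROM '{s3_path}'",
        f"CREDENTIALS '{credentials}'",
        " ".join(options) + ";",
    ]
    return "\n".join(lines)


# Placeholder credentials; substitute real values outside source control.
sql = build_copy_sql(
    "my_table",
    "s3://bucket/folder/file.txt",
    "aws_access_key_id=<key>;aws_secret_access_key=<secret>",
)
print(sql)
```

The values are interpolated directly into the SQL text, so this is only suitable for names and paths you control yourself.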