Wednesday, February 22, 2017

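Django: Migrate SQLite to MySQL DB

Steps to migrate a Django SQLite DB to a MySQL DB:

1. python manage.py dumpdata -o datadump.json (run this while settings.py still points at SQLite)
2. Change the DATABASES setting in settings.py to point to your MySQL database: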

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'database',
        'USER': 'username',
        'PASSWORD': 'password',
        'HOST': 'localhost',   # Or an IP Address that your DB is hosted on
        'PORT': '3306',
    }
}


3. Check that mysqlclient is installed; if it is not, run the command below:
       
       pip install mysqlclient

4. python manage.py migrate --run-syncdb
5. python manage.py loaddata datadump.json
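
The steps above can be sketched as a small helper script. This is only a sketch under assumptions: the function names are made up for illustration, the script is assumed to run from the directory that contains manage.py, and the settings.py change in step 2 still has to be made by hand between the dump and the load.

```python
import subprocess
import sys

DUMP_FILE = "datadump.json"  # same file name as in the steps above


def dump_command(dump_file=DUMP_FILE):
    """Step 1: export the data while settings.py still points at SQLite."""
    return [sys.executable, "manage.py", "dumpdata", "-o", dump_file]


def load_commands(dump_file=DUMP_FILE):
    """Steps 4 and 5: run these after settings.py points at MySQL."""
    return [
        [sys.executable, "manage.py", "migrate", "--run-syncdb"],
        [sys.executable, "manage.py", "loaddata", dump_file],
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Typical use (hypothetical):
#   run_all([dump_command()])   # before editing settings.py
#   run_all(load_commands())    # after editing settings.py
```

Keeping the dump and the load as two separate calls makes it harder to run loaddata against the old SQLite settings by accident.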

Friday, January 27, 2017

Setting up Python Virtual Environment


Follow the steps below to create a virtual environment with Python 2.7; you can replace it with any other installed version.

    $ pip install virtualenv
    $ virtualenv -p python2.7 mypython27

Run the commands below to activate the Python 2.7 virtualenv:

    $ cd mypython27
    $ . bin/activate
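
On Python 3, a similar environment can be created without installing virtualenv. Below is a minimal sketch that uses the standard-library venv module, which is a swap for the virtualenv package used above and does not work for Python 2.7. The function names and the directory name are illustrative.

```python
import os
import sys
import venv


def create_env(env_dir, with_pip=False):
    """Create a virtual environment in env_dir and return the path of its config file."""
    venv.create(env_dir, with_pip=with_pip)
    return os.path.join(env_dir, "pyvenv.cfg")


def in_virtual_env():
    """Return True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


# Typical use (hypothetical directory name):
#   create_env("mypython3", with_pip=True)
# then activate it from the shell with:  . mypython3/bin/activate
```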

Wednesday, January 4, 2017

Determining _PARTITION details in BigQuery Partitioned Table



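Run the queries below (legacy SQL) to check the partition summary of a BigQuery table. The first lists the partition dates present in the table; the second reads the table's __PARTITIONS_SUMMARY__ meta-table: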

SELECT DATE(_PARTITIONDATE) AS PT, DATE(CURRENT_TIMESTAMP()) , DATE(DATE_ADD(CURRENT_TIMESTAMP(), -1, 'DAY'))
FROM [ProjectId:Dataset.Table]
GROUP BY PT

SELECT project_id, dataset_id, table_id, partition_id
, MSEC_TO_TIMESTAMP(creation_time) Created_date, MSEC_TO_TIMESTAMP(last_modified_time) modified_time
from [ProjectId:Dataset.Table$__PARTITIONS_SUMMARY__]
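
If the summary query is generated from a script, the table reference can be filled in programmatically. The sketch below only builds the legacy SQL string and parses partition IDs; it does not call BigQuery. The function names are hypothetical, and it assumes daily partitions whose partition_id is a YYYYMMDD string (any other value is returned as None).

```python
from datetime import datetime


def partitions_summary_query(project, dataset, table):
    """Build the legacy SQL query against the __PARTITIONS_SUMMARY__ meta-table."""
    return (
        "SELECT project_id, dataset_id, table_id, partition_id, "
        "MSEC_TO_TIMESTAMP(creation_time) AS created_date, "
        "MSEC_TO_TIMESTAMP(last_modified_time) AS modified_time "
        "FROM [{0}:{1}.{2}$__PARTITIONS_SUMMARY__]"
    ).format(project, dataset, table)


def partition_id_to_date(partition_id):
    """Convert a YYYYMMDD partition_id to a date; return None for anything else."""
    try:
        return datetime.strptime(partition_id, "%Y%m%d").date()
    except ValueError:
        return None
```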

Friday, December 23, 2016

Install AWS Components in Python Virtual Environment

Installing AWS Components in Python Virtual Environment

Step 1: $ sudo su
Step 2: $ pip install awscli
Step 3: $ aws configure
Enter your AWS Access Key ID and Secret Access Key when prompted to complete the setup.
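
To confirm that the configuration step stored your keys, the credentials file can be inspected from Python. This is a rough sketch that assumes the AWS CLI's usual INI-style file at ~/.aws/credentials; the function name is made up for illustration, and the sketch never prints the key values.

```python
import configparser
import os


def configured_profiles(path=None):
    """Return the profile names in the credentials file that have an access key set."""
    path = path or os.path.expanduser("~/.aws/credentials")
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently treated as empty
    return [
        section
        for section in parser.sections()
        if parser.has_option(section, "aws_access_key_id")
    ]


# Typical use:
#   print(configured_profiles())
```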

Install gcloud components in python virtual environment

Installing gcloud components in virtual environment

Step 1: Download the Google Cloud SDK file (below is an example for Mac)
Step 2: Extract the tar file, navigate to the extracted google-cloud-sdk folder and run:
$ ./install.sh
If you face any issues like Module Platform missing, execute the command below:
$ export CLOUDSDK_PYTHON_SITEPACKAGES=1

Step 3: Run gcloud init
Step 4: Open a new terminal and run 'bq ls'. An authorization prompt appears; complete the authorization.
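
After opening the new terminal, a quick check that the installer put the tools on your PATH can save some guessing. A small sketch, assuming Python 3 and that the commands are named gcloud and bq as in the steps above; the function name is illustrative.

```python
import shutil


def missing_tools(names=("gcloud", "bq")):
    """Return the command names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# Typical use:
#   print(missing_tools())   # an empty list means both tools were found
```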

Monday, November 14, 2016

How to Change Default Version of Python to another Version

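Follow the steps below to make the python command run a different version:

Step 1: Go to the /home/<user> directory in your terminal.
Step 2: Run vim ~/.bashrc_aliases (sudo is not needed for a file in your own home directory)
Step 3: Add the line alias python=python3
Step 4: Run source ~/.bashrc_aliases

This makes the python command run python3 in the current shell. For new terminals to pick up the alias, ~/.bashrc must source this file (on Debian/Ubuntu the default ~/.bashrc sources ~/.bash_aliases instead).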
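
To verify which interpreter actually runs, a tiny script like the one below can be saved (for example as check_python.py, a hypothetical name) and started with the python command. Keep in mind that a shell alias only applies to commands typed in the shell, so scripts started by other programs may still use the old version.

```python
import sys


def interpreter_summary():
    """Describe the interpreter that is executing this script."""
    return "{0} (Python {1}.{2})".format(
        sys.executable, sys.version_info[0], sys.version_info[1]
    )


def is_python3():
    """Return True when the running interpreter is Python 3 or newer."""
    return sys.version_info[0] >= 3


if __name__ == "__main__":
    print(interpreter_summary())
```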

Tuesday, October 25, 2016

Google Cloud BQ Command Line Data Load

'bq' is the command-line tool provided by Google Cloud Platform to access BigQuery tables and perform operations such as DDL and DML.

Refer to http://mahadevanrv.blogspot.in/2016/06/install-google-bigquery-command-line.html for GCloud installation.

Loading query results into a destination table with bq can be done in three modes:

1. Empty (default): Writes data into an empty table; if the table already contains data, it throws an error.

bq query --max_rows=1000 --destination_table=<table_name> 'SELECT * FROM [project:dataset.source_table];'

2. Replace: Replaces the current table with the new query output, so all existing data in the destination table is lost. Use it with care, for example for an incremental load that involves both updates and inserts.

bq query --replace --destination_table=<table_name> 'SELECT * FROM [project:dataset.source_table];'

3. Append: Appends new records to the existing table. Running the same command more than once creates duplicate records. It suits incremental loads that involve only inserts.

bq query --append_table --destination_table=<table_name> 'SELECT * FROM [project:dataset.source_table];'
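
When these loads are scripted, the three modes can be wrapped in one helper so the flag is chosen in a single place. The sketch below only builds the argument list for the bq command; the function name and the mode names ("empty", "replace", "append") are made up for illustration, and the list would be passed to subprocess.run to actually execute it.

```python
# Maps an illustrative mode name to the bq flag that selects it.
WRITE_MODE_FLAGS = {
    "empty": [],                    # default: fails if the table already has data
    "replace": ["--replace"],       # overwrites the destination table
    "append": ["--append_table"],   # adds rows to the destination table
}


def bq_query_command(sql, destination_table, mode="empty"):
    """Build the argument list for a bq query that writes to destination_table."""
    if mode not in WRITE_MODE_FLAGS:
        raise ValueError("Unknown write mode: {0}".format(mode))
    return (
        ["bq", "query"]
        + WRITE_MODE_FLAGS[mode]
        + ["--destination_table=" + destination_table, sql]
    )


# Typical use (hypothetical table names):
#   import subprocess
#   cmd = bq_query_command(
#       "SELECT * FROM [project:dataset.source_table]", "dataset.target", "append")
#   subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems with the SQL string.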