Friday, February 26, 2016

Add/upload documents in image or varbinary datatype

This post related to SQL Server, question arise in my mind where all the documents goes in database, if we can save all documents images, videos  in database, answer is yes, we can do. I have started looking into this, how we can do this. So my knowledge which contains couple of hour and success is below. You try in SQLServer and create new table, then place any document at right place and change the path and execute the below query.

create table documents(id int not null, doctype varchar(4), content varbinary(max));

INSERT INTO documents(id, doctype, content)
SELECT
    2 as id,
    '.pdf' AS doctype,
    * FROM OPENROWSET(BULK N'D:\common\ebooks2\India-Year-Book-2009.pdf', SINGLE_BLOB) AS Document;
GO

select * from documents;
go


Add comments if you like the post.

How to pull out blob content out to a directory using kettle

This post related to , if you had saved the blob data in your table then how to get your data out from a database table, its quiet easy. Some of the database also have their own algorithm of doing this task, but i am doing this with ETL kettle tool which is pretty fast(its not pretty its incredibly fast), proceed with follow steps:

1. create table, add some blob data
2. create transformation and execute it


1. Create table and add blob data
follow below step/post

CREATE TABLE t (id NUMBER(10),doc BLOB, name varchar2(30) );

CREATE SEQUENCE t_seq;

CREATE OR REPLACE DIRECTORY documents AS 'd:\app';

CREATE OR REPLACE PROCEDURE upload_t (file_name_v IN varchar) AS
v_bfile BFILE;
v_blob BLOB;
BEGIN
INSERT INTO t (id, doc,name) vALUES (t_seq.NEXTVAL, empty_blob(),file_name_v) RETURN doc INTO v_blob;
v_bfile := BFILENAME('DOCUMENTS', file_name_v);
Dbms_Lob.Fileopen(v_bfile, Dbms_Lob.File_Readonly);
Dbms_Lob.Loadfromfile(v_blob, v_bfile, Dbms_Lob.Getlength(v_bfile));
Dbms_Lob.Fileclose(v_bfile);

COMMIT;
end;
/

EXEC upload_t('New Text Document.txt');


2. Build transformation as per below image:




Add below java script to step "Modified Java Script Value"


var File = java.io.File;
var FileOutputStream = java.io.FileOutputStream;
var BufferedOutputStream = java.io.BufferedOutputStream;
var out_msg;
var err_msg;
var filename = substr(getVariable('Internal.Transformation.Filename.Directory', '') +'/'+file_name, 7);
var outBytes = file;
try{
    var f = new File(filename);
      if (!f.exists()){
        f.createNewFile();
    }
 
    var fos = new FileOutputStream(f);
    var bos = new BufferedOutputStream(fos);
    bos.write(outBytes);
    bos.flush();
    bos.close();
 
    out_msg = 'succeed';
//    err_msg = 'OK';
 
}
catch (e){
    out_msg = 'Failed';
    err_msg = e.javaException.toString();
}


Save and execute transformation

It will fetch the blob data and build the file at path same as of transformation path. Moreover you can modify javascript to add custom path.

NOTE*- file_name must contain the value as file name with postfix extension (format - filename.extension).

BENIFIT - benefit of using this code is it will work for ORACLE and SQLServer Database (these two are tested by me)
Oracle - Blob datatype
SQLServer - Image and Varbinary(max) datatype

Wednesday, February 24, 2016

Call job through kitchen, stop displaying error on batch [kettle]

Scenario -

 I am calling a job through batch, and job contains verification of database connection active inactive. In case if connection parameters are wrong then step 'Check Db connections' will throw error on batch file.

(Batch file code as below):


call kitchen /file:"%CD%\Data_Migration\zzzzzzzzz.kjb" /norep:Y /level:basic > ".\TEST%systemtime%.log"


Open CMD> execution *.bat file

E:\z_temp\kettle_5_4>test.bat
2016/02/24 13:36:56 - Check Db connections 2 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) :
2016/02/24 13:36:56 - Check Db connections 2 - Error occurred while trying to connect to the database
2016/02/24 13:36:56 - Check Db connections 2 -
2016/02/24 13:36:56 - Check Db connections 2 - Error connecting to database: (using class net.sourceforge.jtds.jdbc.Driver)
2016/02/24 13:36:56 - Check Db connections 2 - Login failed for user 'sa'.
2016/02/24 13:36:56 - Check Db connections 2 -
2016/02/24 13:36:56 - Check Db connections 2 - ]

Solution

If you don't want to show errors on log file then made changes as below, while call batch file


call kitchen /file:"%CD%\Data_Migration\zzzzzzzzz.kjb" /norep:Y /level:basic > ".\TEST%systemtime%.log" 2>&1


E:\z_temp\kettle_5_4>test.bat
E:\z_temp\kettle_5_4>

Sunday, February 14, 2016

Function to process each character of String [SQL server]

Some time nested function will not work, what is the solution? !!!!!!

Want to process each character of string???


--drop function dbo.replacechars
create FUNCTION replacechars (@InStr varchar(max))
RETURNS varchar(max)
AS
BEGIN
declare @alter_string varchar(max) , @length int , @cnt int , @StoreChar varchar(100)
SET @alter_string = ''
SET @length = 0
SET @cnt = 1
SET @length = LEN(@InStr)
IF @length > 0
BEGIN WHILE @cnt <= @length
BEGIN
      SET @StoreChar = SUBSTRING(@InStr, @cnt, 1)
      set @StoreChar =
            case when isnull(@StoreChar,' ') = ';' then '&#.x3b;'
                  when isnull(@StoreChar,' ') = '#' then '&#.x23;'
                  when isnull(@StoreChar,' ') = '&' then '&#.x26;'
                  when isnull(@StoreChar,' ') = '!' then '&#.x21;'
                  when isnull(@StoreChar,' ') = '@' then '&#.x40;'
                  when isnull(@StoreChar,' ') = '$' then '&#.x24;'
                  when isnull(@StoreChar,' ') = '%' then '&#.x25;'
                  when isnull(@StoreChar,' ') = '^' then '&#.x5e;'
                  when isnull(@StoreChar,' ') = '*' then '&#.x2a;'
                  when isnull(@StoreChar,' ') = '(' then '&#.x28;'
                  when isnull(@StoreChar,' ') = ')' then '&#.x29;'
                  when isnull(@StoreChar,' ') = '=' then '&#.x3d;'
                  when isnull(@StoreChar,' ') = '+' then '&#.x2b;'
                  when isnull(@StoreChar,' ') = '{' then '&#.x7b;'
                  when isnull(@StoreChar,' ') = '}' then '&#.x7d;'
                  when isnull(@StoreChar,' ') = '[' then '&#.x5b;'
                  when isnull(@StoreChar,' ') = ']' then '&#.x5d;'
                  when isnull(@StoreChar,' ') = '\' then '&#.x5c;'
                  when isnull(@StoreChar,' ') = '|' then '&#.x7c;'
                  when isnull(@StoreChar,' ') = '''' then '&#.x27;'
                  when isnull(@StoreChar,' ') = ':' then '&#.x3a;'
                  when isnull(@StoreChar,' ') = '"' then '&#.x22;'
                  when isnull(@StoreChar,' ') = '<' then '&#.x3c;'
                  when isnull(@StoreChar,' ') = '>' then '&#.x3e;'
                  when isnull(@StoreChar,' ') = '?' then '&#.x3f;'
                  when isnull(@StoreChar,' ') = '/' then '&#.x2f;'
                  when isnull(@StoreChar,' ') = CHAR(160) then '&#.xa0;'
                  when isnull(@StoreChar,' ') = CHAR(10) then '&#.xd;&#.xa;'
                  else isnull(@StoreChar,' ') end
BEGIN SET @alter_string = @alter_string + isnull(@StoreChar,' ') END
SET @cnt = @cnt + 1
END END
RETURN @alter_string END


Note - PLEASE remove . (dot) from string written after then clause

select dbo.replacechars('a ; f5637# $ % ^ ')

Replace Special Char from string for "table input step" kettle transformation

Some requirement i faced, to write dynamic SQL script for kettle transformation. SQL scripts are generally written in "table input step" and "SQL script" steps.

Observation i have made with two different version of data integration tool (kettle by pentaho) with 4.4 and 6.0.

Version 4.4
When we wrote some SQL script in transformation (table input step) , and if we open the saved transformation in notepad, i observed that the SQL script was in same format enclosed with html tags.

Version 5.4 and above (my observation, version are not accurate)
When i wrote some SQL script in transformation and i open the transformation in notepad, observed that the SQL script written in Spoon (developers mode), now has been in changed format. i seen that the special chars used in SQL script has been changed to some numbers.

I have listed out some special char conversions, as in below image:



Now the main task is how i can replace all above char in incoming String from any source.

Nested replace function will not work here, because replacing string contains itself special chars.

So i created own function, which will process each char of incoming string.

To see the function please refer to below link

Function to process each character of String [SQL server]

Friday, February 12, 2016

Split string on the first occurrence of underscore using kettle

Having a string contain underscore (string can have multiple underscores), want to split on first occurrence of underscore into two parts in kettle. To achieve this I used ‘Java Script’ step, which is quite easy and can perform using single step.


Below is the code for ‘Modified Java Script Value’ step:

var ToBeSplit = InString;
var IndexLevel = ToBeSplit.indexOf("_");
var LeftTrim = ToBeSplit.substr('_',IndexLevel);

var RightTrim = ToBeSplit.substr(IndexLevel+1);


web stats