Friday, May 22, 2020

pg_dump Postgres Backup Restore Error

This is new error i faced while taking backup of database from Postgres.

setup-
1. Server A -> Location ABC -> Postgres Server version - 9.6.18
2. Machine B -> Location EDF -> Postgres Server version - 9.5.3

I created connection of Server A from machine B and initiate backup from machine B of Server A database. My intention was to bring down the server database to another machine. 

In Postgres it is possible to create copy of backup at your local machine (machine B), if database is installed on another machine (as compare to in SQLServer it is not possible).

When i started taking backup, it throw below error:

pg_dump: server version: 9.6.11; pg_dump version: 9.5.3
pg_dump: aborting because of server version mismatch

Process returned exit code 1.
So, here i do not have option to install or re-install postgres of version 9.6.11 on my machine.

So what i did, postgres is open source, i get the pg_dump.jar file and pg_restore.jar file of version 9.6.11 and replaced the actual file of version 9.5.3.

Note* - take backup of pg_dump/pg_restore of 9.5.3 postgres install.
It is available in C directory of  postgres/version /bin

Now take backup and restore it will work!!

Tuesday, May 19, 2020

How to Find Duplicate Records in Table | Interview Question

When generally asked for duplicate, every time for most of person start thing about the group by, count clause. Which is correct, but these will not work every where. Count(*), Group by , having are the for beginner level. Student get learnt from basics.

On other hand, when we are working in IT industry then window function are usable in real time examples.

What i think Count(*), group by, having are usable for OLAP database, and window function are usable at OLTP level.

Lets see result with some example:

 XYZ   
 IDFirstName LastName MiddleName 
 1 Gurjeet Singh
 2 Harry Potter 
 3 Gurjeet  Singh  
 4 Prem Singh L
 5 Harry Potter 


When i use the query 

SELECT count(*), FirstName, LastName, MiddleName
FROM XYZ
GROUP BY  FirstName, LastName, MiddleName

So result will be 
CountFirstName LastName MiddleName 
 2 Gurjeet Singh
 2 Harry Potter 
 1 Prem Singh L

I am agree the query return the correct result, but it is providing the Count and Name of Duplicate person. Means Gurjeet is the person exists by 2 times in table.

But for some cases we also need ID along with the duplicate person name, which is not possible with the help of GROUP by. So here we have to use window function.

select * from (
SELECT 
    ID, 
    FirstName, 
    LastName, 
    MiddleName,
    row_number() over(partition by Firstname, LastName, MiddleName order by ID) rowNumber
FROM XYZ
) t where rowNumber > 1

So how query will work, below table represent how the row_number will be assign to records

 XYZ    
 IDFirstName LastName MiddleName  rn
 1 Gurjeet Singh
 1
 2 Harry Potter  1
 3 Gurjeet  Singh   2
 4 Prem Singh L 1
 5 Harry Potter 
 2

When we use the outer query, and apply the filter "where rowNumber > 1" below result set will be appear. Here main focus is column ID.
IDFirstName LastName MiddleName 
 3 Gurjeet Singh
 4 Prem Singh L
 5 Harry Potter 


Now in some cases we also need ID along with the all duplicate person name (only Duplicate), which is possible with the help of using Count(*) using Partition by clause.

select * from (
SELECT 
    ID, 
    FirstName, 
    LastName, 
    MiddleName,
    Count(*) over(partition by Firstname, LastName, MiddleName order by ID) TotalCount
FROM XYZ
) t where rowNumber > 1

XYZ    
 IDFirstName LastName MiddleName  TotalCount
 1 Gurjeet Singh
 2
 2 Harry Potter  2
 3 Gurjeet  Singh   2
 5 Harry Potter  2

The transaction log for database 'Database Name' is full due to 'LOG_BACKUP

This error faced by me, during huge data insert in one table. In between Pentaho got crashed, using which i am inserting huge random data in table.

Can we fix this. If you are working in IT industry, should i do this my self, or should i approach the database administrator. I suggest we can do this in without any data loss, if you stuck in emergency requirement.

This error you will face cause of transaction logs got full cause of LOG_BACKUP when high insert update delete operation performed. At this stage we will not able to perform any kind of task on database.

To solve the issue we need to execute below steps:
  • Take Full database backup
  • Shrink log file to reduce the physical file size (ldf file)
  • Create a LOG Backup

Below are the steps need to perform on database:
ALTER DATABASE "Database Name" SET RECOVERY SIMPLE;
GO
DBCC SHRINKFILE ("Database Name", 1);  -- Shrink the log file to 1 MB
GO
ALTER DATABASE "Database Name" SET RECOVERY FULL;  
GO

Monday, May 18, 2020

A Very Unique Case : Delete Duplicate from Cross Referenced Table [Parent Child relationship]

Cross referenced table, i do not know if i am pronounce this is correctly or not. First see the table structure below, and let me know what this table is known as?, how to pronounce it? What is name of structure storing this kind of data.

 XYZ   
 IDCustomerID [Reference to Cus.] RelatedCustomerID [Reference to Cus.]   Relation Ship of Customers
 1 20021 20022 is children of 
 2  20022 20021 is father of
 3  20023 20024 is friend of 
 4  20024 20023 is brother of
 5 20025 20026 is children of


Do not forget to provide your inputs in comment section


Now the case is earlier the application is designed in way to hold the relationship between 2 customers in either way. Now as per new requirement this relationship are duplicate. We need to keep only 1 relation ship from cross reference rows.

example - 
1. Row 1 and Row 2 are duplicate, we need to keep only 1 row (any one)
2. Row 3 and Row 4 are duplicate, we need to keep only 1 row  (any one)
3. Row 5 is itself is unique

So did workaround with friends, of-course nothing is found on internet.google. Tried Differnt type of join , inner join , self join, min , max, dense rank, pivot but nothing works

Below is the query we prepared after all.

SELECT 
    distinct
    CASE
        WHEN CustomerID < RelatedCustomerID  THEN CustomerID 
        ELSE RelatedCustomerID 
        END AS CustomerID_v,
     CASE
        WHEN CustomerID  > RelatedCustomerID  THEN CustomerID 
        ELSE RelatedCustomerID 
        END AS RelatedCustomerID_V
FROM 
    XYZ
 
This could be a good interview question as well

Sunday, May 17, 2020

Multi Association (many to Many)

Multi Association is kind of association in database between 2 tables, generally this is theoretically term.

multi- association reference generally handle by below table, this is also called linking table. It has same synonyms like
cross-reference table,
bridge table,
join table,
map table,
intersection table,
linking table,
link table

Structure Explanation, ( There is 2 table, Customer is one table, Address is second table), I need to create link table where i can store ID of both tables. I will create a table Custom_address whose structure like as below.

 Customer   
 ID First Name Last Name Middle Name
 1 KindSinghs. 
 2 NiceSingh 


Address  
 ID State Country
 1 PunjabIndia
 2 GoaIndia
 3 New YorkUnited State


Customer_ID column has reference to Customer table, and Address_id column has reference to Address table

 Customer Id Address Id
 11
 13
 21
 22


Note*- Its also depends upon the database designer, some time user create ID column in link table sometimes not.

Friday, May 15, 2020

Best Way to Index Rebuild and Reorganize

Index rebuild is very essential part in database. Now a days OLTP database doing millions of transactions in system. TO keep performance keep of SQL query we need to daily schedule job to rebuild indexes.

I have gone through the internet, search many sites. Most of them suggested if Fragmentation is more than 30% we need rebuild the index.

And it is suggested if Fragmentation is between 5 and 30 then it is sufficient to perform Index Re-Organize.

Here i write the open procedure to handle both of case. It will automatically handle above situation and execute rebuild/re-organize of indexes accordingly

IF OBJECT_ID('IndexRebuildReOrg') IS NOT NULL 
        DROP TABLE IndexRebuildReOrg

BEGIN TRY
    SET NOCOUNT ON;
    DECLARE @v_obj_id INT;
    DECLARE @v_ind_id INT;
    DECLARE @partitioncount BIGINT;
    DECLARE @v_part_num BIGINT;
    DECLARE @v_frag FLOAT;
    DECLARE @v_page_cnt INT;
    DECLARE @Query NVARCHAR(4000);
    DECLARE @v_obj_name nvarchar(255);
    DECLARE @v_ind_name nvarchar(255);
    DECLARE @v_schema_name nvarchar(255);
    
-- Load all index Name and fragmentation Value in Table

    SELECT  a.object_id AS v_obj_id ,
            a.index_id AS v_ind_id ,
            partition_number AS v_part_num ,
            avg_fragmentation_in_percent AS v_frag ,
            page_count AS page_count,
            object_name (a.object_id) Obj_Name,
            i.name as Ind_name ,
            schema_name (schema_id) schema_name
    INTO    IndexRebuildReOrg
    FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL,'LIMITED') a
    join sys.indexes i on i.index_id = a.index_id and i.object_id = a.object_id
    join sys.objects  o on i.object_id = o.object_id
    WHERE   a.index_id > 0;

-- Cursor Handling, close if already open
IF CURSOR_STATUS('global', 'indRebuild') >= -1
BEGIN
 PRINT 'indRebuild CURSOR DELETED' ;
    CLOSE indRebuild
    DEALLOCATE indRebuild
END

-- Cursor for Index to be fragmentation/Open/Loop indRebuild cursor
    DECLARE indRebuild CURSOR LOCAL FOR SELECT * FROM IndexRebuildReOrg where v_frag > 5;
    OPEN indRebuild;
    WHILE ( 1 = 1 )
        BEGIN;
            FETCH NEXT FROM indRebuild INTO @v_obj_id, @v_ind_id, @v_part_num, @v_frag, @v_page_cnt, @v_obj_name, @v_ind_name, @v_schema_name;

            IF @@FETCH_STATUS < 0
                BREAK;

            If (@v_frag > 5 and @v_frag < 30)
            BEGIN
            SET @Query = N'ALTER INDEX ' + @v_ind_name + N' ON ' + @v_schema_name + N'.' + @v_obj_name + N' REORGANIZE';
            END
            
            If @v_frag > 30 
            BEGIN
            SET @Query = N'ALTER INDEX ' + @v_ind_name + N' ON ' + @v_schema_name + N'.' + @v_obj_name + N' REBUILD';

            IF @partitioncount > 1
                SET @Query = @Query + N' PARTITION=' + CAST(@v_part_num AS NVARCHAR(10));
            END
                EXEC (@Query);
                --print (@Query); 

            PRINT N'Rebuilding/Reorganizing index ' + @v_ind_name + ' on table ' + @v_obj_name;
            PRINT N'Fragmentation: ' + CAST(@v_frag AS VARCHAR(15));
            PRINT N'Page Count: ' + CAST(@v_page_cnt AS VARCHAR(15));
        END;

    CLOSE indRebuild;
    DEALLOCATE indRebuild;

    DROP TABLE IndexRebuildReOrg;

END TRY
BEGIN CATCH
    PRINT 'ERROR ENCOUNTERED:' + ERROR_MESSAGE()
END CATCH

Sunday, May 10, 2020

Drop table If Already Exists SQLServer

Drop table if already exists in SQLServer. It is good Practice to keep this command before any create table script.

Type 1
IF EXISTS(SELECT * FROM dbo.test)
DROP TABLE dbo.test;

Type 2
IF OBJECT_ID('dbo.test', 'U') IS NOT NULL 
DROP TABLE dbo.test;


Type 3
IF exists (select * from INFORMATION_SCHEMA.TABLES where TABLE_NAME = 'test' AND TABLE_SCHEMA = 'dbo')
Drop table dbo.test;
web stats