Knowledge is Here...: May 2020

Friday, May 22, 2020

pg_dump Postgres Backup Restore Error

This is new error i faced while taking backup of database from Postgres.

setup-

1. Server A -> Location ABC -> Postgres Server version - 9.6.18

2. Machine B -> Location EDF -> Postgres Server version - 9.5.3

I created connection of Server A from machine B and initiate backup from machine B of Server A database. My intention was to bring down the server database to another machine.

In Postgres it is possible to create copy of backup at your local machine (machine B), if database is installed on another machine (as compare to in SQLServer it is not possible).

When i started taking backup, it throw below error:

pg_dump: server version: 9.6.11; pg_dump version: 9.5.3
pg_dump: aborting because of server version mismatch


Process returned exit code 1.

So, here i do not have option to install or re-install postgres of version 9.6.11 on my machine.

So what i did, postgres is open source, i get the pg_dump.jar file and pg_restore.jar file of version 9.6.11 and replaced the actual file of version 9.5.3.

Note* - take backup of pg_dump/pg_restore of 9.5.3 postgres install.

It is available in C directory of postgres/version /bin

Now take backup and restore it will work!!

Tuesday, May 19, 2020

How to Find Duplicate Records in Table | Interview Question

When generally asked for duplicate, every time for most of person start thing about the group by, count clause. Which is correct, but these will not work every where. Count(*), Group by , having are the for beginner level. Student get learnt from basics.

On other hand, when we are working in IT industry then window function are usable in real time examples.

What i think Count(*), group by, having are usable for OLAP database, and window function are usable at OLTP level.

Lets see result with some example:

XYZ
ID	FirstName	LastName	MiddleName
1	Gurjeet	Singh
2	Harry	Potter
3	Gurjeet	Singh
4	Prem	Singh	L
5	Harry	Potter

When i use the query

SELECT count(*), FirstName, LastName, MiddleName
FROM XYZ
GROUP BY  FirstName, LastName, MiddleName

So result will be

Count	FirstName	LastName	MiddleName
2	Gurjeet	Singh
2	Harry	Potter
1	Prem	Singh	L

I am agree the query return the correct result, but it is providing the Count and Name of Duplicate person. Means Gurjeet is the person exists by 2 times in table.

But for some cases we also need ID along with the duplicate person name, which is not possible with the help of GROUP by. So here we have to use window function.

select * from (
SELECT 
    ID, 
    FirstName, 
    LastName, 
    MiddleName,
    row_number() over(partition by Firstname, LastName, MiddleName order by ID) rowNumber
FROM XYZ
) t where rowNumber > 1

So how query will work, below table represent how the row_number will be assign to records

XYZ
ID	FirstName	LastName	MiddleName	rn
1	Gurjeet	Singh		1
2	Harry	Potter		1
3	Gurjeet	Singh		2
4	Prem	Singh	L	1
5	Harry	Potter		2

When we use the outer query, and apply the filter "where rowNumber > 1" below result set will be appear. Here main focus is column ID.

ID	FirstName	LastName	MiddleName
3	Gurjeet	Singh
4	Prem	Singh	L
5	Harry	Potter

Now in some cases we also need ID along with the all duplicate person name (only Duplicate), which is possible with the help of using Count(*) using Partition by clause.

select * from (
SELECT 
    ID, 
    FirstName, 
    LastName, 
    MiddleName,
    Count(*) over(partition by Firstname, LastName, MiddleName order by ID) TotalCount
FROM XYZ
) t where rowNumber > 1

XYZ
ID	FirstName	LastName	MiddleName	TotalCount
1	Gurjeet	Singh		2
2	Harry	Potter		2
3	Gurjeet	Singh		2
5	Harry	Potter		2

The transaction log for database 'Database Name' is full due to 'LOG_BACKUP

This error faced by me, during huge data insert in one table. In between Pentaho got crashed, using which i am inserting huge random data in table.

Can we fix this. If you are working in IT industry, should i do this my self, or should i approach the database administrator. I suggest we can do this in without any data loss, if you stuck in emergency requirement.

This error you will face cause of transaction logs got full cause of LOG_BACKUP when high insert update delete operation performed. At this stage we will not able to perform any kind of task on database.

To solve the issue we need to execute below steps:

Take Full database backup
Shrink log file to reduce the physical file size (ldf file)
Create a LOG Backup

Below are the steps need to perform on database:

ALTER DATABASE "Database Name" SET RECOVERY SIMPLE;
GO
DBCC SHRINKFILE ("Database Name", 1);  -- Shrink the log file to 1 MB
GO
ALTER DATABASE "Database Name" SET RECOVERY FULL;  
GO

Monday, May 18, 2020

A Very Unique Case : Delete Duplicate from Cross Referenced Table [Parent Child relationship]

Cross referenced table, i do not know if i am pronounce this is correctly or not. First see the table structure below, and let me know what this table is known as?, how to pronounce it? What is name of structure storing this kind of data.

XYZ
ID	CustomerID [Reference to Cus.]	RelatedCustomerID [Reference to Cus.]	Relation Ship of Customers
1	20021	20022	is children of
2	20022	20021	is father of
3	20023	20024	is friend of
4	20024	20023	is brother of
5	20025	20026	is children of

Do not forget to provide your inputs in comment section

Now the case is earlier the application is designed in way to hold the relationship between 2 customers in either way. Now as per new requirement this relationship are duplicate. We need to keep only 1 relation ship from cross reference rows.

example -

1. Row 1 and Row 2 are duplicate, we need to keep only 1 row (any one)

2. Row 3 and Row 4 are duplicate, we need to keep only 1 row (any one)

3. Row 5 is itself is unique

So did workaround with friends, of-course nothing is found on internet.google. Tried Differnt type of join , inner join , self join, min , max, dense rank, pivot but nothing works

Below is the query we prepared after all.

SELECT 
    distinct
    CASE
        WHEN CustomerID < RelatedCustomerID  THEN CustomerID 
        ELSE RelatedCustomerID 
        END AS CustomerID_v,
     CASE
        WHEN CustomerID  > RelatedCustomerID  THEN CustomerID 
        ELSE RelatedCustomerID 
        END AS RelatedCustomerID_V
FROM 
    XYZ

This could be a good interview question as well

Sunday, May 17, 2020

Multi Association (many to Many)

Multi Association is kind of association in database between 2 tables, generally this is theoretically term.

multi- association reference generally handle by below table, this is also called linking table. It has same synonyms like
cross-reference table,
bridge table,
join table,
map table,
intersection table,
linking table,
link table

Structure Explanation, ( There is 2 table, Customer is one table, Address is second table), I need to create link table where i can store ID of both tables. I will create a table Custom_address whose structure like as below.

Customer
ID	First Name	Last Name	Middle Name
1	Kind	Singh	s.
2	Nice	Singh

Address
ID	State	Country
1	Punjab	India
2	Goa	India
3	New York	United State

Customer_ID column has reference to Customer table, and Address_id column has reference to Address table

Customer Id	Address Id
1	1
1	3
2	1
2	2

Note*- Its also depends upon the database designer, some time user create ID column in link table sometimes not.

Friday, May 15, 2020

Best Way to Index Rebuild and Reorganize

Index rebuild is very essential part in database. Now a days OLTP database doing millions of transactions in system. TO keep performance keep of SQL query we need to daily schedule job to rebuild indexes.

I have gone through the internet, search many sites. Most of them suggested if Fragmentation is more than 30% we need rebuild the index.

And it is suggested if Fragmentation is between 5 and 30 then it is sufficient to perform Index Re-Organize.

Here i write the open procedure to handle both of case. It will automatically handle above situation and execute rebuild/re-organize of indexes accordingly

IF OBJECT_ID('IndexRebuildReOrg') IS NOT NULL
DROP TABLE IndexRebuildReOrg

BEGIN TRY
SET NOCOUNT ON;
DECLARE @v_obj_id INT;
DECLARE @v_ind_id INT;
DECLARE @partitioncount BIGINT;
DECLARE @v_part_num BIGINT;
DECLARE @v_frag FLOAT;
DECLARE @v_page_cnt INT;
DECLARE @Query NVARCHAR(4000);
DECLARE @v_obj_name nvarchar(255);
DECLARE @v_ind_name nvarchar(255);
DECLARE @v_schema_name nvarchar(255);

-- Load all index Name and fragmentation Value in Table

SELECT a.object_id AS v_obj_id ,
a.index_id AS v_ind_id ,
partition_number AS v_part_num ,
avg_fragmentation_in_percent AS v_frag ,
page_count AS page_count,
object_name (a.object_id) Obj_Name,
i.name as Ind_name ,
schema_name (schema_id) schema_name
INTO IndexRebuildReOrg
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL,'LIMITED') a
join sys.indexes i on i.index_id = a.index_id and i.object_id = a.object_id
join sys.objects o on i.object_id = o.object_id
WHERE a.index_id > 0;

-- Cursor Handling, close if already open
IF CURSOR_STATUS('global', 'indRebuild') >= -1
BEGIN
PRINT 'indRebuild CURSOR DELETED' ;
CLOSE indRebuild
DEALLOCATE indRebuild
END

-- Cursor for Index to be fragmentation/Open/Loop indRebuild cursor
DECLARE indRebuild CURSOR LOCAL FOR SELECT * FROM IndexRebuildReOrg where v_frag > 5;
OPEN indRebuild;
WHILE ( 1 = 1 )
BEGIN;
FETCH NEXT FROM indRebuild INTO @v_obj_id, @v_ind_id, @v_part_num, @v_frag, @v_page_cnt, @v_obj_name, @v_ind_name, @v_schema_name;

IF @@FETCH_STATUS < 0
BREAK;

If (@v_frag > 5 and @v_frag < 30)
BEGIN
SET @Query = N'ALTER INDEX ' + @v_ind_name + N' ON ' + @v_schema_name + N'.' + @v_obj_name + N' REORGANIZE';
END

If @v_frag > 30
BEGIN
SET @Query = N'ALTER INDEX ' + @v_ind_name + N' ON ' + @v_schema_name + N'.' + @v_obj_name + N' REBUILD';

IF @partitioncount > 1
SET @Query = @Query + N' PARTITION=' + CAST(@v_part_num AS NVARCHAR(10));
END
EXEC (@Query);
--print (@Query);

PRINT N'Rebuilding/Reorganizing index ' + @v_ind_name + ' on table ' + @v_obj_name;
PRINT N'Fragmentation: ' + CAST(@v_frag AS VARCHAR(15));
PRINT N'Page Count: ' + CAST(@v_page_cnt AS VARCHAR(15));
END;

CLOSE indRebuild;
DEALLOCATE indRebuild;

DROP TABLE IndexRebuildReOrg;

END TRY
BEGIN CATCH
PRINT 'ERROR ENCOUNTERED:' + ERROR_MESSAGE()
END CATCH

Sunday, May 10, 2020

Drop table If Already Exists SQLServer

Drop table if already exists in SQLServer. It is good Practice to keep this command before any create table script.

Type 1
IF EXISTS(SELECT * FROM dbo.test)
DROP TABLE dbo.test;

Type 2
IF OBJECT_ID('dbo.test', 'U') IS NOT NULL 
DROP TABLE dbo.test;

Type 3
IF exists (select * from INFORMATION_SCHEMA.TABLES where TABLE_NAME = 'test' AND TABLE_SCHEMA = 'dbo')
Drop table dbo.test;

Knowledge is Here...

Pages

Labels