
Monday, July 24, 2023

NonParallelPlanReason="CouldNotGenerateValidParallelPlan" - one of the possible reasons

When you see that your query doesn't go parallel, and in the execution
plan XML you find NonParallelPlanReason="CouldNotGenerateValidParallelPlan",
note that prior to SQL Server 2022 there is no indication of the actual reason.

One of the possible reasons is that you are querying a table that has
a scalar user-defined function (UDF) in a check constraint or in a computed column.

Scalar UDFs in check constraints and in computed columns prevent parallelism (even for SELECT queries).

Try disabling the check constraint and check whether the query goes parallel.
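
Here is a minimal repro sketch; the table, column, function, and constraint names are hypothetical:

-- A scalar UDF used by a check constraint
CREATE FUNCTION dbo.fn_IsValidAmount (@Amount money)
RETURNS bit
AS
BEGIN
    RETURN CASE WHEN @Amount >= 0 THEN 1 ELSE 0 END;
END
GO

ALTER TABLE dbo.Orders
ADD CONSTRAINT CK_Orders_Amount CHECK (dbo.fn_IsValidAmount(Amount) = 1);
GO

-- Even a plain SELECT against dbo.Orders can now get a serial plan with
-- NonParallelPlanReason="CouldNotGenerateValidParallelPlan".

-- Disable the constraint and re-check the plan:
ALTER TABLE dbo.Orders NOCHECK CONSTRAINT CK_Orders_Amount;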

Monday, November 28, 2022

Minimizing deadlocks


A deadlock occurs when two or more sessions permanently block each other
by each session having a lock on a resource that the other session(s) are trying to lock.

For example:

Transaction A acquires a shared lock on row 1.
Transaction B acquires a shared lock on row 2.

Transaction A now requests an exclusive lock on row 2,
and is blocked until transaction B finishes and releases the shared lock it has on row 2.

Transaction B now requests an exclusive lock on row 1,
and is blocked until transaction A finishes and releases the shared lock it has on row 1.
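
As a minimal sketch, assuming a hypothetical dbo.Accounts table (the HOLDLOCK hint
keeps the shared locks until commit, matching the scenario above):

-- Session A
BEGIN TRAN;
SELECT Balance FROM dbo.Accounts WITH (HOLDLOCK) WHERE Id = 1;  -- S lock on row 1

-- Session B
BEGIN TRAN;
SELECT Balance FROM dbo.Accounts WITH (HOLDLOCK) WHERE Id = 2;  -- S lock on row 2

-- Session A
UPDATE dbo.Accounts SET Balance = 0 WHERE Id = 2;  -- blocked by B's S lock

-- Session B
UPDATE dbo.Accounts SET Balance = 0 WHERE Id = 1;  -- deadlock; one session is chosen as the victim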

We can group the deadlocks into the following categories:
  • Reader-Writer deadlocks
  • Writer-Writer Deadlocks
  • Key Lookup Deadlocks
  • Parallelism-related deadlocks
  • Range Scans and serializable deadlocks
  • Partition escalation deadlocks

Now let's see what we can do to minimize and/or completely prevent deadlocks:

1) Access the shared objects/tables in the same order.

    Ensure that all the resources are accessed in the same order all the time.
    For example, if two concurrent transactions both start by requesting
    the lock for row 1 and only later request the lock for row 2,
    the result is simple blocking rather than a deadlock:
    transaction 1 can never deadlock transaction 2, because the resource locks
    are never acquired out of order.

2) Reduce a query's lock footprint by making the query as efficient as possible.

     Make sure the involved queries use row locks and not page/table locks.
     Try to read less data and avoid index scans.

3) In the case of Key Lookup deadlocks,
    there are 2 possible solutions:

    a) Add the looked-up columns to the existing non-clustered index as INCLUDE columns,
        which will eliminate the lookup (see the index sketch below).
    b) If the clustered index is small enough,
        use the FORCESCAN hint on the SELECT to force a clustered index scan.

    Key Lookup deadlocks are the most common ones. Here is an example:

    The SELECT uses the NC index to find the qualifying row(s).
    While holding a Shared lock on the NC index, it needs to jump over
    to the clustered index and retrieve some columns that aren't part of the NC index.
    While it's doing this, the UPDATE is busy doing a seek on the clustered index.
    It finds a row, locks it and modifies it. But because one of the columns
    being modified is a key column in the NC index, it then has to move to the NC index
    and update that index, too. This requires a second X key lock, on the NC index.
    So, the SELECT ends up blocked waiting for the UPDATE to release its X lock
    on the clustered index, while the UPDATE winds up blocked waiting for the
    SELECT to release its S lock on the NC index.
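
    A sketch of solution (a); the table, index, and column names are hypothetical:

    -- The SELECT seeks on IX_Orders_CustomerId and then does a key lookup
    -- to fetch OrderDate and Amount from the clustered index.
    -- Including those columns in the NC index eliminates the lookup:
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
    ON dbo.Orders (CustomerId)
    INCLUDE (OrderDate, Amount)
    WITH (DROP_EXISTING = ON);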

4) Force the blocking to occur at an earlier point 

   Sometimes the best way to prevent a deadlock is to force the blocking to occur
   at an earlier point in one of the two transactions.

   For example, if you force process A to be blocked by process B at the very
   beginning of A's transaction, A may never get a chance to acquire the lock
   resource that would later end up blocking process B.
   The HOLDLOCK and UPDLOCK hints can be useful for this.
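
   For instance, a read-then-update pattern can take the update lock up front,
   so two sessions serialize on the first statement instead of deadlocking later
   (the table and column names are hypothetical):

   DECLARE @balance money;

   BEGIN TRAN;
       -- Take a U lock immediately instead of an S lock
       SELECT @balance = Balance
       FROM dbo.Accounts WITH (UPDLOCK)
       WHERE Id = 1;

       UPDATE dbo.Accounts SET Balance = @balance - 100 WHERE Id = 1;
   COMMIT;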

5) Deadlocks caused by parallelism
  
    These happen on parallel query plan executions;
    you will see a "parallelism exchange event" in the deadlock graph/XML.

    One possible solution here is to add a MAXDOP 1 hint to force the queries to go serial
    (as in the sketch below). Another is to check whether adding an index can reduce
    the query cost enough for the query to be executed on a single thread.
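
    A minimal sketch of the hint; the query and table are hypothetical:

    SELECT CustomerId, SUM(Amount) AS Total
    FROM dbo.Orders
    GROUP BY CustomerId
    OPTION (MAXDOP 1);  -- force a serial plan for this query only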

6) Deadlock between two MERGE statements
    
     Try adding the HOLDLOCK hint to the MERGE statement.
     If that doesn't help, try rewriting the MERGE as separate INSERT/UPDATE statements.
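
     A sketch of the hint placement; the target and source tables are hypothetical:

     MERGE dbo.Customers WITH (HOLDLOCK) AS target
     USING dbo.CustomersStaging AS source
         ON target.Id = source.Id
     WHEN MATCHED THEN
         UPDATE SET target.Name = source.Name
     WHEN NOT MATCHED THEN
         INSERT (Id, Name) VALUES (source.Id, source.Name);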


7) Use a lower isolation level

   Check if the transaction can run at a lower isolation level and 
   use the minimum necessary transaction isolation level.

   Under read committed, SQL Server holds shared locks for a shorter duration than
   under a higher isolation level such as repeatable read or serializable.
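
   For example, a minimal sketch (dbo.Accounts is hypothetical):

   SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- instead of REPEATABLE READ/SERIALIZABLE

   BEGIN TRAN;
       SELECT Balance FROM dbo.Accounts WHERE Id = 1;  -- S lock is released right after the read
   COMMIT;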

8) Use RCSI (read committed snapshot isolation)

    Row versioning-based isolation (RCSI and snapshot isolation) does not take shared locks
    during read operations, which can minimize deadlocks between read and write operations.
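
    Both are enabled at the database level; a sketch (the database name is hypothetical):

    -- Readers under READ COMMITTED get row versions instead of taking S locks
    ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;

    -- Optionally also allow the SNAPSHOT isolation level
    ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;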

9) Implement your own manual locking (sp_getapplock)
 
    If possible, consider locking/unlocking objects manually yourself.
    This can be done by using the sp_getapplock procedure.

    Add sp_getapplock to the 2 involved stored procedures or queries,
    something like below:
 
    begin tran
             exec sp_getapplock @Resource = 'MyTran', @LockMode = 'Exclusive';

             ... your code ...

    commit

   Only one client will be able to run this code at a time; everyone else is serialized.
   With the default @LockOwner = 'Transaction', the lock is released at commit/rollback;
   if you acquire the lock with @LockOwner = 'Session', don't forget to call sp_releaseapplock.


Friday, September 9, 2022

Change Tracking cleanup doesn't work - negative cleanup_version / change_tracking_hardened_cleanup_version

So you have enabled Change Tracking with AUTO_CLEANUP = ON,
but the CT cleanup doesn't work: no records are deleted from the side tables
or from the sys.syscommittab system table after the retention period has passed.
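
You can first confirm the retention and auto-cleanup settings with a quick check:

select db_name(database_id) as database_name,
       is_auto_cleanup_on,
       retention_period,
       retention_period_units_desc
from sys.change_tracking_databases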

When you run: 

-- Delete the data in sys.syscommittab
exec sys.sp_flush_commit_table_on_demand @numrows = 1000

you get a negative value for change_tracking_hardened_cleanup_version in the output:

The value returned by change_tracking_hardened_cleanup_version() is -XXXXXXXXX

Or you can see that the cleanup_version is negative from this query:

select sc.name as tracked_schema_name,
       so.name as tracked_table_name,
       ctt.cleanup_version
from sys.change_tracking_tables as ctt
inner join sys.objects as so on ctt.[object_id] = so.[object_id]
inner join sys.schemas as sc on so.schema_id = sc.schema_id
where so.name = 'XXX-your-table-name-XXX'
go

Also, when you run:

exec sp_flush_CT_internal_table_on_demand 

you see "Total rows deleted: 0" in the output.

If so, there is a fix from Microsoft; see it here.
You will also need to enable trace flag 8290 and let the auto cleanup process run.

DBCC TRACEON( 8290, -1)
go
DBCC TRACESTATUS(8290)
go

Finally, you can see the history of Change Tracking cleanups in the table below:

select * from dbo.MSchange_tracking_history

Friday, January 15, 2021

How to identify MySQL replication slave lag

Let's first understand how the standard asynchronous
MySQL replication works at a high level.

For simplicity, let's assume that there is a Master and one Slave node,
and that the MySQL version is 5.6.

The master maintains a special log, called the binary log, which stores replication events.
This binary log file stores the data that the replication slave will read later.

MySQL can use 3 formats for writing data to the binary log:

  • Statement-based - the SQL query itself is written to the binary log.
  • Row-based - the row changes are logged in the binary log.
  • Mixed - statement-based logging is used by default, but the logging mode
    switches automatically to row-based in certain cases.

The slave node has 2 processes/threads that are responsible for the replication:

IO Thread  - a process that pulls data from the master's binary log over the network
             and writes the data/changes to a local file called the relay log.

SQL Thread - a process that continuously reads the relay log
             and applies the changes to the slave database.

So, in asynchronous replication, the slave node requests a packet from the master,
the master sends the packet and doesn't check if the slave received the data packet or not.

Also, as you can see, there is only one thread that reads the binary log from the master.

The slave node keeps track of the position in the master's binlog of the last event
applied on the slave. This allows the slave server to re-connect and resume
from where it left off after replication has been temporarily stopped.

When you see a replication lag, the first thing you need to understand is which of the two
replication threads (IO Thread or SQL Thread) is behind.

For this, execute the commands below :

On the master: show master status;
On the slave:   show slave status;

To check where IO Thread currently is, compare the values from the following fields:

From the output of the show master status;

      File -  for example: mysql-bin-changelog.549445

From the output of the show slave status;

      Master_Log_File – last file copied from the master
                                  for example: mysql-bin-changelog.549440


Compare the numeric value of the "File" field from the master status with
the value of "Master_Log_File" from the slave status.
The difference is how many binlogs the slave is behind (IO Thread).

To get the current status of the SQL Thread, check the following values:

Read_Master_Log_Pos – the position up to which the master's binary log
                      has been copied over to the relay log on the slave.

Exec_Master_Log_Pos – the position in the master's binary log up to which
                      the SQL thread has executed events.

You can calculate the lag of SQL_Thread like this:
Read_Master_Log_Pos - Exec_Master_Log_Pos

Also, the "Seconds_Behind_Master" field should give you some sort of indication.
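
For example, with hypothetical values from the slave status output:

show slave status\G
...
        Master_Log_File: mysql-bin-changelog.549440
    Read_Master_Log_Pos: 120000
    Exec_Master_Log_Pos: 80000
  Seconds_Behind_Master: 35
...

Here the SQL Thread lag is 120000 - 80000 = 40000 bytes of binlog
that are already in the relay log but not yet applied.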

Possible reasons for IO_Thread lag:

- Slow network between master/slave - see if enabling 'slave_compressed_protocol' helps
- Binary logging on the slave - it is IO intensive; you can disable it unless you need
                                it for point-in-time recovery.

Possible reasons for SQL Thread lag:

- Tables without a PK/unique key when you use the row-based binlog format

Wednesday, October 28, 2020

MySQL guide for SQL Server developers/DBAs - Part 1

Do you know what "My" means in MySQL?
"My" is the name of the daughter of Michael Widenius, one of the co-founders.

If you are a SQL Server developer/DBA and need to switch to MySQL,
here are some useful gotchas:


Last inserted identity value
    SQL Server: @@IDENTITY, SCOPE_IDENTITY() - return the last automatically generated value.
    MySQL: LAST_INSERT_ID() - returns the FIRST (not last) automatically generated value,
    so be careful when you insert multiple rows (see the example below).

Nested transactions
    SQL Server: Supported.
    MySQL: Not supported; a START TRANSACTION statement will implicitly
    commit any current transaction.

Local variables
    SQL Server: Supported everywhere: in SPs and in batches/scripts.
    MySQL: There are local variables and system variables.
    Local variables can be used only inside SPs; you cannot use them in a script/batch.

IF / WHILE statements
    SQL Server: Supported everywhere.
    MySQL: Not supported in scripts/batches;
    they can be used only in SPs/functions/triggers.

Alter stored procedure
    SQL Server: The ALTER PROCEDURE statement is supported.
    MySQL: Not supported; if you need to alter an SP,
    you need to drop it and recreate it.

Stored procedure parameters
    SQL Server: Default values are supported, and parameters can be referenced by name.
    MySQL: No default values, and parameters cannot be referenced by name.

PRINT statement
    SQL Server: Supported.
    MySQL: Not supported.

Division by zero
    SQL Server: select 1/0 will return an error.
    MySQL: select 1/0 will return NULL.

Temp tables
    SQL Server: A temp table can be referenced as many times as you want in a query.
    MySQL: You cannot refer to a TEMPORARY table more than once in the same query.

ROW_NUMBER()
    SQL Server: Window functions are supported.
    MySQL: Not supported until version 8; can be emulated as:
        SET @row_number = 0;
        SELECT (@row_number := @row_number + 1) AS RowId ...

Activity monitoring
    SQL Server: sp_who2, sp_whoisactive, sys.dm_exec_sessions, sys.dm_exec_requests.
    MySQL: show processlist; show full processlist;
    SELECT * FROM information_schema.processlist;
    the tables under performance_schema: events_statements_history_long,
    events_statements_history, events_statements_current; and mysql.slow_log.

Open transactions
    SQL Server: dbcc opentran, or SELECT * FROM sys.sysprocesses WHERE open_tran = 1.
    MySQL: select * from INFORMATION_SCHEMA.INNODB_TRX;
    Use this query to see which transactions are waiting and which are blocking them:

    SELECT r.trx_id waiting_trx_id,
           r.trx_mysql_thread_id waiting_thread,
           r.trx_query waiting_query,
           b.trx_id blocking_trx_id,
           b.trx_mysql_thread_id blocking_thread,
           b.trx_query blocking_query
    FROM information_schema.innodb_lock_waits w
    INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
    INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;
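
For instance, the LAST_INSERT_ID() gotcha with a multi-row insert
(the table here is hypothetical):

CREATE TABLE t (id INT AUTO_INCREMENT PRIMARY KEY, val VARCHAR(10));

INSERT INTO t (val) VALUES ('a'), ('b'), ('c');  -- generates ids 1, 2, 3

SELECT LAST_INSERT_ID();  -- returns 1, the FIRST generated id, not 3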

 Enjoy :)

Friday, October 23, 2020

Fork-and-Branch Git Workflow (GitHub)

Some companies use GitHub and the Fork-and-Branch Git Workflow.
Let's see how it works in detail.

Each developer will have 2 server-side Git repositories: 
  • private (origin) 
  • public (upstream)
The diagram below shows the high level
of the Fork-and-Branch Git Workflow:

[Diagram: fork-and-branch Git workflow]

The “fork and branch” workflow:
  • Fork the central GitHub repository.
  • Clone the forked repository to your local system.
  • Add a Git remote (upstream) for the original central repository.
  • Create a feature branch in which to place your changes.
  • Make your changes to the new branch.
  • Commit the changes to the branch.
  • Push the branch to GitHub.
  • Open a pull request from the new branch to the original central repository.
  • Clean up after your pull request is merged.

Forking Company GitHub Repository

The first step is to fork the company's central GitHub repository.
Forking is basically making a copy of the repository,
but with a link back to the original.
  • Log in to GitHub with your account.
  • Choose the central GitHub repository, for example: "CompanyName/RepoName".
  • Click the Fork button on the upper right-hand side of the repository’s page.
  • You should now have a "YourName/RepoName" server-side repository.

Clone the forked repository

Navigate to the forked repository and look on the right-hand side of the web page.
You should see a button labeled "Code".

Copy the URL there, and then use it with git clone like this:
git clone git@github.com:YourNameHere/RepoName.git

Adding the remote (upstream)

Now the server-side forked repository acts as a Git remote, named origin.
To use the “fork and branch” workflow, you’ll need to add a Git remote (upstream)
pointing back to the original central repository.

cd "YourLocalFolderWhereYouClonedTheRepo"
git remote add upstream git@github.com:CompanyName/RepoName.git

Now, to test that everything is configured correctly, run the following
command; you should see output like the below:

git remote -v
origin    git@github.com:YourNameHere/RepoName.git (fetch)
origin    git@github.com:YourNameHere/RepoName.git (push)
upstream  git@github.com:CompanyName/RepoName.git (fetch)
upstream  git@github.com:CompanyName/RepoName.git (push)

Creating a Branch

So far, you’ve forked a central repository, cloned it down to your local system,
and added a Git remote(upstream) to your clone that points to the original
central repository on GitHub.

Now, to make changes, we need to create a branch in the local repository.
After adding commit(s) to the newly created branch,
we will push the branch to GitHub and create a Pull Request
to merge the changes into the central repository.

To create a new branch and check it out, use this command:

git checkout -b <new branch name>

If you are creating a new branch some time after creating the fork,
you should first sync the fork with upstream before branching from it:

git checkout master
git pull upstream master
git push origin master

Commit the changes to the local repo

Once you are done with the changes, use the git add .
command to stage all the changes;
it tells Git that you want to include the updated file(s) in the next commit.

git add .

Then execute git commit to apply the staged changes to your local repository:

git commit -m "Your Comments"

Pushing changes to GitHub

Once you’ve committed the changes to your local repository,
the next step is to push those changes up to GitHub.

If you were working in a branch called TestBranch,
then pushing the changes you made in that branch back to GitHub would look like this:

git push origin TestBranch

Opening a Pull Request

After you push the new branch up to your GitHub server-side repository, 
you need to create a pull request.

This can be done on the GitHub website.

After pushing the branch, browse the GitHub website
and you will see an option to create a pull request.


Merge Pull Request

Once your Pull Request is approved, you need to merge it into the central repository (upstream).

Cleaning up After a Merged Pull Request

First, you should update your local clone:

git pull upstream master

This pulls the changes from the original central repository’s master branch
(indicated by upstream and master in that command) into your local cloned repository.
Afterward, you can delete the feature branch, because the changes
are already in the master branch:
git checkout master
git branch -d <branch name>
Then you can update the master branch in your forked repository:
git push origin master
And push the deletion of the feature branch to your GitHub repository:
git push --delete origin <branch name>

Keeping Your Fork in Sync

The forked repository doesn’t automatically stay in sync with the original repository.
To keep your fork in sync with the original repository, use these commands:

git checkout master
git pull upstream master
git push origin master
This pulls the changes from the original repository 
(the one pointed to by the upstream Git remote) and
pushes them to your forked repository (the one pointed to by the origin remote).

Monday, August 24, 2020

MySQL - ERROR 1298 (HY000): Unknown or incorrect time zone: 'UTC'

Let's assume you have MySQL Server 5.6 installed on a Windows machine,
and you get the error below when trying to set time_zone to 'UTC'
or another named time zone. Follow the steps below to resolve the issue.

SET time_zone = 'UTC';
ERROR 1298 (HY000): Unknown or incorrect time zone: 'UTC'

The issue is that you do not have the time zone data installed.
To install the time zones on a Windows machine, you will need to:

1) Download "timezone_2020a_posix.zip - POSIX standard"

2) Stop the MySQL service

3) Extract the files from the downloaded archive 
    and copy them, overwriting the existing files,
    to C:\ProgramData\MySQL\MySQL Server 5.6\data\mysql
    
    Yes, to C:\ProgramData\.. and not to C:\Program Files (x86)\MySQL\...

4) Start the MySQL service
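
5) Verify the fix; after the service is back up, the SET that failed before
    should now succeed:

    SET time_zone = 'UTC';
    SELECT @@global.time_zone, @@session.time_zone;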

Enjoy:)