Learn execution plan

https://www.mssqltips.com/sqlservertutorial/2250/graphical-query-plan-tutorial/

http://stackoverflow.com/questions/758912/how-to-read-an-execution-plan-in-sql-server

 

Advertisements

NoSQL 1

There are 4 types of NoSQL as below:
Key-Value Database
Document Database
Column-Family Database
Graph Database

NoSQL có gì hay ho – Tổng quan về NoSQL – Phần 2

Elasticsearch in 5 minutes

http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html

http://joelabrahamsson.com/elasticsearch-101/

Query DSL: syntax like json, has 2 parts: query and filter

 

Architecture: Core is Lucene. It has HTTP build on top

capture

Useful in: Blog, Analics (by find trend), Documents

Why use it: Speed, Relevancy, Statistics

Concept: save data in NoSQL Document (JSON)

capture

Shard??

Indexes alias??

Query DSL format: json

SQL Server Data Type Mappings

http://msdn.microsoft.com/en-us/library/cc716729.aspx

http://social.msdn.microsoft.com/Forums/sqlserver/en-US/d496c385-e0d9-4b98-81a4-1181270ea030/sql-server-vs-c-data-types

Difference between primary key and unique key

I am going to discuss some of the most important differences between Unique Key and Primary key in databases. Typically this is a very common topic but the most misleading as well. I am not going to write down the already known differences but will discuss the things that one should understand before deciding the primary key and unique key in database design.

example1

By definition , Primary key and Unique key technically means the same thing that is, a column or group of columns that can identify a uniqueness in a row. The role of both Unique key and primary key is the same in  this term (so technically they are same). The most common mistake that people do during database design is to create a primary key as ID int not null column and think that it is a unique identifier for the row. Although this is a correct statement but this primary key (Id column) is just a number added as an extra attribute to the table and technically does not belong to the definition of the data. Let me give you an example(See Fig 1):

The most common mistake that people do during database design is to create a primary key as ID int not null column and think that it is a unique identifier for the row. Although this is a correct statement but this primary key (Id column) is just a number added as an extra attribute to the table and technically does not belong to the definition of the data. Let me give you an example(See Fig 1):
The Fig 1 shows a table with Id column as PK . Now, this table consists of the Employee FName and LName. By looking at an example you can see that 1,John,Neville ; 2,John,Neville and so on is just a same thing. Adding Id column really did not make the row unique. So, Just by creating an Id column does not mean that you have  found a way to uniquely identify your row (Does not solve the problem). Basically what we are looking to do is to create a key that belongs to a data that can truly identify uniqueness.
A better way of designing a same table would be to add some more attributes to the table to make the row unique that also conforms to the meaning of the data. May be a add an SSN or something.

So the takeaway from this discussion is that always think about the unique key on the table in terms of the data related to the table and not by adding a value such as Id.

Creating a unique key that is data relevant helps the query statistics to remain consistent and helps the query optimizer to come with an optimal query execution plan that results in faster processing.

Another misconception that people have about Unique and Primary key is that Primary key always creates a Clustered Index and Unique key always creates a Non Clustered index. However this statement is true in a way that when you use a SQL server management studio and use a User Interface to create keys , it by default create clustered index on Primary key and Non Clustered index on Unique key  but it is not mandatory to do that. I would suggest always use scripts to create your keys.
So, it is not mandatory to create clustered index on primary or non clustered on unique key . You can create a clustered index on unique key and  non clustered on primary key.

ALTER TABLE dbo.Employee ADD  CONSTRAINT [PK_ID] PRIMARY KEY NONCLUSTERED(
[Id] ASC
)

This is a perfectly valid statement in SQL. So don’t be confused with this indexing for keys.
So to sum up, Following are the things that needs to be keep in mind before thinking about UK and PK
1) How can I make this row unique without introducing the fake Id column.
2) Technically Primary key and Unique key is the same thing
3) You can typically save the extra joins if there are some look up tables by figuring out the unique values. I will show this thing in my later blogs.
4) Sql server by default does not allow nulls in PK but you can put nulls in a UK column.
I hope you have enjoyed reading this blog. Please post on the comments if you have any thoughts to share.

Some concepts in DB

Database Replication: if the same data is stored on multiple storage devices

NULL values are ignored for all the aggregate functions

@@ERROR: Returns the error number for the last Transact-SQL statement executed.

SQL Wildcard Characters: are used with the SQL LIKE operator: %, _, [charlist], [^charlist], [!charlist]

Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored. There are several benefits for using Normalization in Database.

Benefits:

  • Eliminate data redundancy
  • Improve performance
  • Query optimization
  • Faster update due to less number of columns in one table
  • Index improvement

http://msdn.microsoft.com/en-us/library/aa933055(v=sql.80).aspx

 

Denomalization on the contrary is the process of adding redundant data to speed up complex queries involving multiple table JOINS. One might just go to a lower form of Normalization to achieve Denormalization and better performance. Data is included in one table from another in order to eliminate the second table which reduces the number of JOINS in a query and thus achieves performance

Normalization Pros: no (or at least reduced) redundancy faster writes if just 1 smaller table needs updating/inserting/deleting ideal for transaction (OLTP) systems

Denormalization Pros: easier for nontechnical users to understand faster reads if all or most of the data is needed anyway ideal for reporting (OLAP) systems

http://msdn.microsoft.com/en-us/library/ms172432(v=sql.105).aspx

Database index

database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of slower writes and the use of more storage space. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

Reference: http://en.wikipedia.org/wiki/Database_index

There are clustered indexes and non-clustered indexes in SQL Server. With non-clustered indexes, the leaf level nodes contain only the index data with a pointer to the associated data page where the remaining data resides. As a result, data access that uses a non-clustered index may cause extra reads of the data from the data page. With clustered indexes, the leaf level nodes of the B-tree contain the actual data rows for the table. There is only one clustered index per table. Remember that the clustering key is used in all non-clustered indexes as the row identifier, so choose them wisely.

Chapter 14 — Improving SQL Server Performance: http://msdn.microsoft.com/en-us/library/ms998577.aspx