If you are developer, then you work with relational databases as well. And if you work in a Windows world, most likely your system of choice is Microsoft SQL Server.
With every new DBMS release we find something that changes the way we solve common problems in new solutions. SQL Server 2012 is no exception.
In SQL Server 2012, instead of just enhancing the productivity and robustness of the server, Microsoft solved a number of architectural problems we previously had to deal with in our applications when working with SQL Server.
Problems like saving files in databases, load balancing database clusters and simple restoration of data, finally got the minimum viable solution, which should only improve with time.
1. Store and Query Documents with File Tables
I would say that there are two common ways to store files in your web application. First one – you store them on disk, be it share or SAN or any other persistent storage, and register information about the file in the database. That way you can backup your files separately from the database, as the file content doesn’t fill the database files, and make sure that attachments become the responsibility of file server managers.
Second way, introduced in SQL 2005, is to use FILESTREAM. Same BLOBs stored in varbinary(max) fields, but physically stored differently, in NTFS stream. Also, FILESTREAM uses the NT system cache for caching file data. This helps reduce any effect that FILESTREAM data might have on Database Engine performance. The SQL Server buffer pool is not used; therefore, this memory is available for query processing.
Well, both methods have their (dis)advantages out of scope of this article, but the new method of storing data files is both simple and efficient.
In fact, that’s very similar method to the one I’ve used two years ago when I created my own file sending service, when I didn’t want to pay the filemail.com license. In short – you store the information about the files in the database, you store files in a file share, and monitor changes to the file system, immediately reflecting them in the database.
A few notes about File Tables:
- You can access information about your files using either normal T-SQL query, or by accessing the file directly.
- You can not change the schema or structure of the file table.
- You can’t create file tables in TEMP or SYS databases.
- Folder must reside in persistent storage, i.e. no memory drives. However, you can map a directory in memory disk and use it, as then SQL Server won’t know it’s volatile.
More information about File Tables in MSDN.
2. Statistical Semantic Search
How would you like to be able to select a word document, say, a CV, and with a single click of a button to find similar CVs stored in your database? It’s never been easier, thanks to the new feature of SQL Server 2012 – the Statistical Semantic Search.
SSS requires a bit more preparation, but results are like magic. First, you need to install the SSS database, which you may find in your installation media, the installation file is called SemanticLanguageDatabase.msi. It will basically extract the database file to the directory of your choice. Then, you’ll have to attach it to your instance of SQL Server, only one database per instance, and you’re ready.
Make sure you don’t change that database, it’s not meant to be edited.
In order to index Microsoft Office documents, especially the ones with the new document format, you’ll need to install Microsoft Office filter packs for Microsoft Search – either 32 or 64 bit.
More information about SQL Server Statistical Semantic Search in MSDN
3. AlwaysOn clustering
Some time ago I covered the SQL Server 2008 R2 failover clustering. It’s a must have function for any vital database. In case the term is new for you, here is the link to original article and video tutorial. The problem of failover cluster – there can be only one. You may have two servers running at the same time, but only one of them would serve the clients.
It’s not so with AlwaysOn feature of SQL Server 2012. Now you may have up to 5 nodes linked into one NLB cluster (quite similar to IIS NLB clusters, which I covered in 2011), which is also a failover cluster. Practically that means – if one computer fails, another node automatically takes the primary role, but all nodes are active and can serve clients, and some of these nodes may be serving specifically the read-only requests.
There is a principle, usually called “20/80”, which declares that usually no more than 20% of all requests to SQL server are ending with the change to the database. The rest are based on SELECTs. For example – websites and reports produce a lot of requests, but relatively not many changes.
With AlwaysOn cluster, you can specify the intention of your connection, and if it’s intention to get the read-only connection, most likely you’ll be connected to the read-only node.
Another benefit of AlwaysOn cluster is that connection between nodes may be synchronous or asynchronous, the latter could use your bandwidth more efficiently, but if the primary node would fail in process, some data could be lost. It depends from your cluster, anyway.
Whenever you need to perform maintenance on one of the nodes, you can manually switch the primary role to another node or simply set this node “offline”, so it won’t serve clients until the maintenance is over.
The AlwaysOn availability cluster doesn’t switch the master role to another node when there are problems with particular memory pages. See the last topic of this article for the ways to restore particular memory pages when needed.
More information about AlwaysOn Availability Groups in MSDN
4. Contained Databases
And as we are speaking about the clustering and replication of the databases, one thing that many SQL developers find confusing or complex is the way you authenticate and authorize with SQL Server. As you know, you must have a user and a login, linked to each other. Login allows you to provide credentials to log into the SQL Server, while user account is used to claim rights and own resources.
Normally, when you replicate the database, you don’t replicate the server-level logins, so you have to re-create them. Well, not necessarily if your database is contained.
The contained database allows you to authenticate against the database, not the server. This feature has it’s pros and cons, but in some scenarios, like deploying our database to many customers, it could save a lot of time on supporting users, as invalid server logon is one of the most common problems.
Existing databases may become contained, you just have to alter them and change the containment level to PARTIAL. There is no FULL level of containment, but the word “partial” implies that “full” may be introduced in the future.
One thing that is a bit tricky now is how the authentication works now. Take a look at the algorithm:
First thing that SQL Server checks is whether or not you provide the initial catalog (i.e. the database name) in connection string. If you don’t, there is no contained database authentication for you, you authenticate against the server.
However, if you do specify the database, and it’s contained database, then you will authenticate against the database user, and in that case you’d better not have two users with identical usernames in both database and server, as failing to authenticate as database user will lead to authentication failure, there won’t be a fallback to server-level authentication.
So always ensure your local database users have login names that you won’t ever have at server level. For example – dbJohnDoe instead of JohnDoe.
More about Contained Databases in MSDN
5. Transact-SQL Enhancements
As always, each new release of SQL Server brings new functions and syntax features. These are the most useful ones for me:
The first one, WITH RESULT SETS clause, allows you to automatically convert and rename the result you get from your stored procedure. In this example, stored procedure may return different fields with different names, but then the first field will be converted to nvarchar with limit of 20 chars and no NULL values, second column will be renamed and converted to integer, and so on.
This way we can ensure the backward compatibility between our stored procedures and data consumers, such as applications – you won’t have to rewrite and recompile the data consuming code just because you’ve changed the stored procedure. Wonderful!
The second one, THROW statement, is pretty much like RAISEERROR, but it’s much more like what you are using in programming languages like C# or VB – you can specify arbitrary error code, specify strict (i.e. non-formatted, unlike in RAISEERROR), and specify the arbitrary state byte. In RAISEERROR you would also have to specify the severity level, but here it’s always 16.
The OFFSET and FETCH keywords are self-explanatory – perfect way to page your data set output.
We’ve also got a new set of functions in T-SQL:
As you can see in this example, we’ve got closer to what we have in .NET. For example – TRY_PARSE will return either the result of conversion or NULL, but will not throw an exception. In the example above, we would get NULL, as we would try to convert 345 British pounds to the data type of money using the en-US culture, instead of en-GB. The same is with TRY_CONVERT.
Another function from this list that worth explaining is EOMONTH. As you may guess, it means “end of month”, and it returns the last day of the month, which contains the date that you specified as the parameter. So, if you would specify the 16th February of 2013, it would return 28th of February, as February only has 28 days. The last parameter is optional and it’s for offset, in months, so if you would ask for EOMONTH and specify 16/02/2013 as the date and 1 as the offset, you would get 31/03/2013 as the result date – the last day of the next month after February. The offset value could be negative as well.
6. Management Tool Enhancements
SQL Server Management Studio changed significantly, although everything looks suspiciously similar. That’s because the changes are not cosmetic, but functional. We’ve got a better intellisense, snippets and enhanced debugging, making Management Studio look and feel just like Microsoft Visual Studio.
Code snippets basically allow you to paste pre-defined code fragments from intellisense menu, and it’s technique which increases productivity of .NET developers for years.
Debugging with watch windows and locals and breakpoints is what .NET developers think is natural is given. Now also to DBA.
7. Security Enhancements
Bruce Schneier’s “Applied Cryptography” is my desk book, and that explains why I consider security enhancements in SQL Server 2012 to be significant and important.
In SQL Server 2008 and up, when you need to encrypt particular data, say, a credit card number or social security number, you need to generate AES key and protect that key by either a password, a certificate or a database master key. Everything was cool, excerpt that database master key was Triple-DES, which is strong but relatively slow. Anyway, not that slow to become a bottleneck for your application. But it’s like having biometric keys on all doors excerpt on the door to the server room, where biometric system stores it’s data.
So, since SQL Server 2012, the database and service master keys are stored using AES algorithm. The maximum certificate length was increased to 4096 bits.
When you need to store user passwords, usually you are using hashes of that passwords, so neither you or anyone who hacks into your database could see or guess the original password. For that we are using T-SQL function HASHBYTES, which is very simple to use and it can use a variety of algorithms – the stronger the slower. The fastest one now is MD5, which was hacked and therefore prohibited by some security standards, and various SHA algorithms of different length. So, now it also supports SHA2 with lengths of 256 and 512 bits. Hooray.
Database and server level passwords are now hashed using SHA512, which adds to the security of contained database.
The RC4 algorithm is deprecated and can only be used for databases with compatibility level of 90 or 100. You can’t use it in new databases.
Auditing is now enabled for all versions of SQL Server, but particular database auditing is only for Enterprise version and up.
8. Point-In-Time Restore
The last thing in this list is much simplified restoration of the database. Remember the traditional way of building the whole algorithm and restore plan, with full backup, differential, transaction log and tail-log backups? Well, they are still valid, of course, but when the time comes to restore your database – there are two new features that will greatly simplify your life, even though they are simple dialog boxes for a more complex T-SQL queries.
Point-In-Time Restore allows you to specify the moment in time, to which you want to restore your database. SQL Server will use all available backup sets to do it automatically.
Another thing, that becomes quite useful when using the AlwaysOn availability group, is Page Restore. This method is only good when there are very few pages to be restored, otherwise you may want to restore the whole files or the database.
More about Point-In-Time recovery in MSDN
More about Page Restore in MSDN