Common Language Runtime from a DBA point of view

When Common Language Runtime (CLR) was introduced in SQL Server 2005 it was enthusiastically received by developers. CLR gave them a powerful new way to write database code using the same modern, object-oriented languages they used for other application development.

From the DBA point of view, CLR blurred the distinction between compiled application code and database code. It raised questions about the roles of developer and DBA in the deployment, management and security of CLR code modules. To some extent it also blurred the distinction between the business logic tier and the data access tier in n-tier development models.

For many organizations, implementing CLR requires decisions about the development process and the roles of those involved. The purpose of this article is to offer enough information about CLR to facilitate those decisions without immersing the reader in too much technical detail.

We begin by offering a brief overview of CLR and how it is implemented. Then we briefly discuss the pros and cons of using CLR code. We describe situations where it might be helpful and situations where it might not be appropriate. Finally we discuss, in more detail, the new wrinkles it has brought to SQL Server management and security.

CLR from 5000 feet

CLR is a programming model that allows developers to write code using compiled, object-oriented languages and have the code executed as if it were a TSQL procedure, function, trigger, etc. Briefly, the process goes like this:

A developer creates code in a Visual Studio project. The code may be written in either C# or VB.NET.
The code is compiled and deployed from Visual Studio directly into the target database. The compilation process creates a dynamic link library (DLL). The DLL is an executable file containing the compiled code. Deployment also inserts information about the assembly into sys.assemblies and other related views in the target database.
One or more TSQL objects are created in the target database using the EXTERNAL NAME option to point to a code module in the assembly. These objects are referred to as TSQL prototypes. A prototype can be defined as a stored procedure, user-defined function, trigger, user-defined aggregate or user-defined type. The prototype contains no code. It is simply an entry point that can be called from other database code. When it is called, the compiled code in the DLL is executed.

Normally, all of these steps are taken care of when you compile and deploy the code from a Visual Studio project, but it is also possible to execute each of the steps manually.

Why Use CLR?

Modern object oriented languages have functionality that is difficult or impossible to reproduce using TSQL. CLR offers a way to use the rich feature set of these languages in database code. CLR is often better suited for computational or recursive operations than is TSQL.

CLR code executes in the same memory context as SQL Server, so it runs very efficiently.

What are potential problems with CLR?

CLR code is managed code, meaning that it runs inside a CLR virtual machine that prevents execution of any code with potential to destabilize the server. While this is a good thing, any form of virtualization has some degree of overhead. In some cases it can have a noticeable effect on performance.

CLR can be over-used in ways that can affect performance. Although compiled code normally executes efficiently within the SQL memory space, SQL is still the best option for heavy lifting. If a procedure accesses or returns large amounts of data, it is probably not a good candidate for CLR.

An oversimplified rule of thumb is that SQL handles set-based operations more efficiently and CLR handles iterative, procedural or computational operations more efficiently. There are many exceptions to that rule however.

Managing and Securing CLR

Sometimes the first contact a DBA has with CLR is a request to enable CLR on the server. That is a simple as flipping the �clr enabled� server configuration option. However there are many issues to consider before the switch is flipped.

Who has the right to deploy CLR?

Deploying code into the database has traditionally been the responsibility of the DBA. It has been his/her responsibility to make sure the code is working properly and will not cause performance or other problems on the server. However, CLR code is normally deployed into the database directly from the Visual Studio environment where it was developed. Visual Studio has not traditionally been a tool used by the DBA. This suggests that the developer might be in a better position than the DBA to deploy CLR code.

This is something that has to be worked out in each individual shop. There are many database administrators who would be uncomfortable with a developer having unfettered ability to create compiled code modules that run in the same memory space as SQL Server itself. However, the DBA still has very granular control over who can load CLR assemblies and at which security level the assemblies can function.

Securing CLR

By default, only members of the sysadmin server role, the db_owner and ddl_admin database roles have permission to execute assembly-related DDL statements. The permission can be granted to other users and roles. In addition, there are three levels of permissions that can be assigned to the assembly itself.

CLR Permission sets.

Safe

Only local data access and internal computation are allowed by the SAFE permission set. Access to external system resources such as files, network, environment variables or the registry is not permitted.

Anyone with the permission to create an assembly can create a SAFE assembly. It is the default when the permission level is not explicitly stated in the CREATE ASSEMBLY statement.

External Access

This permission set grants the right to access external system resources such as files, networks, environmental variables and the registry. Creating an assembly with this permission requires that the user have the EXTERNAL ACCESS permission.

Unsafe

Think twice before allowing UNSAFE assemblies on your server. They have essentially the same permissions as EXTERNAL ACCESS but are free from many restrictions on the code they may run. An UNSAFE assembly can call native code. It can perform illegal operations that can threaten the stability of the server.

Only a member of the sysadmins group can grant UNSAFE permissions to an assembly.

Impersonation

By default, CLR code executes in the security context of the SQL Server service account. This presents a problem when accessing resources external to SQL Server since the best practice is to have a minimal set of permissions for the service account. However, it is possible to execute CLR code by impersonating another Windows account.

CLR Metadata

In a CLR-enabled environment the DBA should be familiar with five system views that contain metadata for the CLR objects in the database. Here is a short description of each with suggestions of how each might be used.

sys.assemblies

This view contains one record for every assembly in the database. It contains information about the assembly’s name, security level, creation date, etc

sys.assembly_files

This table contains (or should contain) a row for every source code file, AssemblyInfo file, and the binary source for the dll file for each assembly in the database. If the assembly is deployed from a Visual Studio project, all the files will be added automatically. If the assembly is deployed manually the source code and AssemblyInfo files will have to be also added manually.

sys.assembly_files can provide a measure of source code control. The varbinary [content] column in this view contains the clr source for all code modules and the binary code of the assembly DLL. This means that your CLR source code and the binary contents of the compiled DLL are backed up each time the database is backed up.

You can easily retrieve the text contents of source code files from this view. First set your query window to return results as text. Then execute the following code to return all the C# source code for the specified assembly:

USE MyDatabase;
SET NOCOUNT ON
SELECT CAST(content as varchar(max))
FROM sys.assembly_files f
JOIN sys.assemblies a
ON a.assembly_id = f.assembly_id
WHERE a.name = ‘MyAssembly’
and f.name LIKE ‘%.cs’;

sys.assembly_modules

An assembly can include multiple classes (for example procedure, function, etc). Classes within the assembly can have multiple code modules (methods) that implement different functionality. This view maps the IDs of the individual modules to the object ids of the TSQL prototypes that reference them. To view this data with the IDs resolved to names, use this query:

SELECT OBJECT_NAME(m.object_id) as db_object
,a.name as assembly
,m.assembly_class
,m.assembly_method
FROM sys.assembly_modules m
JOIN sys.assemblies a ON a.assembly_id = m.assembly_id;

You might notice in this view that a module can be mapped to more than one prototype. For example, a single code module might be implemented as a stored procedure and as a function.

sys.assembly_references

An assembly can reference another assembly. This view contains a row for each pair of assemblies where one is directly referencing another. Use this view to find any dependencies before dropping an assembly.

sys.module_assembly_usages

This view maps assembly names to the object ids of the SQL prototypes that reference them. Use this query to resolve the object ids to object names:

USE MyDB:
SELECT OBJECT_NAME(object_id) as db_object,a.name as assembly
FROM sys.module_assembly_usages u
JOIN sys.assemblies a ON a.assembly_id = u.assembly_id

Removing CLR Assemblies

Visual Studio handles the details of creating and deploying CLR code to the SQL Server database, but there is no equivalent tool to uninstall a CLR assembly. You must drop objects manually in the opposite order than they were created.

First, all TSQL prototypes that were created must be dropped. Use sys.module_assembly_usages to find all the TSQL prototypes associated with an assembly. Before dropping a TSQL prototype, you should make sure that no other TSQL objects are dependent on it. The simplest way to do that is to right-click the prototype in the Management Studio object browser then select View Dependencies. You can also use sp_depends if you prefer writing code.

EXEC sp_depends @objname = N’sp_MyProc’;

After all prototypes are dropped, drop the assembly itself with the Drop Assembly command. This will clean up all references to it in the assembly views.

Conclusion

The SQL-only era of database programming is changing. CLR is just one example. Although crusty old SQL guys like me might grumble a bit, change is what makes life as a DBA interesting.

CLR offers real advantages in areas where SQL itself has limitations. The ability to mix the two programming models is a powerful tool and DBAs as well as developers need to acquire the necessary skills if they do not already have them.

Additional menu