I wrote this article for HeadSpring last month but couldn't announce it here until now. It covers validation using the Entity Framework DbContext API. I discuss the various options for validating entities and when each option makes sense. You can find the article here:
http://www.headspring.com/2012/04/entity-framework-code-first-dbcontext-validation
Monday, May 28, 2012
Entity Framework Code First DbContext - Validation
Wednesday, March 14, 2012
Entity Framework Code First - Visualizing Generated Model & Database
Aren't we always curious to find out how the model is generated from the entities we have defined in Code First? Since so many conventions are involved, going to the database and inspecting individual tables is tedious. Fortunately, there are several ways to make it easier to understand the generated database, and even the resulting model, for Code First entities. Let's discuss them.
This post discusses how we can get a complete picture of the related entities in a context and of the resulting database. As they say, a picture is worth a thousand words, so this should make it easier to manage those entities, their relationships and the resulting databases. We will look at three options:
- Database Diagrams
- Viewing Entity Data Model using Visual Studio Integration
- Generating model using EdmxWriter
Database Diagrams
Most SQL client tools can create diagrams for existing tables. We just need to reverse engineer the database and select the tables we need. The diagram shows the details of the selected tables and the foreign key relationships between them. SQL Server Management Studio supports this too: right-click the Database Diagrams folder under your database in Object Explorer and select New Database Diagram.
A dialog then lists all the tables in the database. Select the tables you want to include in the diagram and click the Add button. For our database, the dialog lists the tables generated from our entities.
When we click the button, a new database diagram is created and shown. It contains the tables selected in the previous step, along with the foreign key relationships between them.
Visualizing Model using Entity Framework Power Tools
We can also visualize the expected model at design time using Entity Framework Power Tools. It is a Visual Studio extension that can be installed through the online extension library download feature of Visual Studio.
After installation, selecting a class file in Solution Explorer displays a context menu with an option to view the expected model.
If the selected file contains a subclass of DbContext, the tool generates a read-only model listing all the entities as they are expected to be generated at run time.
Persisting Generated model as *.edmx file
We can also persist the generated model at run time. Entity Framework ... has added a new type to the framework just to support this requirement: EdmxWriter. It uses an XmlWriter to persist the generated model, as in the code below:

    using (var context = new InstituteEntities())
    {
        // Indent the XML so that the generated file is readable
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;

        using (XmlWriter writer = XmlWriter.Create(@"Model.edmx", settings))
        {
            // Writes the model built for this context in EDMX format
            EdmxWriter.WriteEdmx(context, writer);
        }
    }

Here XmlWriter and XmlWriterSettings are from the System.Xml namespace, so you need to import that namespace in the class file before using them. After the above code is executed, we should see the Model.edmx file created in the output directory.
We can view this file in Entity Designer in Visual Studio.
Saturday, March 10, 2012
Entity Framework Code First - Connection with Database
This post is part of a series in which we are discussing various features of Entity Framework Code First. We have been building on an Institute project and adding to it based on the feature being discussed. You might have noticed that we have never shown the actual connection string, yet we have been interacting with the database to persist the institute entities. In this post we discuss how that works without specifying a connection string at all, and how we can customize this default behavior.
Building Connection String:
As we have discussed, a big part of Entity Framework Code First is learning the various conventions it uses. It also follows conventions when interacting with the database. The first convention is about finding the connection string in app.config, and building one if none is found there.
Which Connection String to use from config?
After building the conceptual model, the Entity Framework Code First API tries to locate the database. To connect to it, it needs a connection string. For this purpose, it uses the name of the DbContext or a suggestion from the DbContext. It can use either the type name or the fully qualified name of the DbContext. Let's add this connection string to app.config:

    <connectionStrings>
      <add name="InstituteEntities"
           providerName="System.Data.SqlClient"
           connectionString="Server=.\SQLExpress; Trusted_Connection=true; Database=EFCodeFirstDatabaseCreation.Entities.InstituteEntities"/>
    </connectionStrings>

This results in a connection to the EFCodeFirstDatabaseCreation.Entities.InstituteEntities database on the local machine's SQLExpress instance. Since the System.Data.SqlClient provider is used, it must be a SQL Server instance.
Let's see what happens if we use the fully qualified name of the DbContext as the name of our connection string in app.config.

    <connectionStrings>
      <add name="EFCodeFirstDatabaseCreation.Entities.InstituteEntities"
           providerName="System.Data.SqlClient"
           connectionString="Server=.\SQLExpress; Trusted_Connection=true; Database=EFCodeFirstDatabaseCreation.Entities.InstituteEntities"/>
    </connectionStrings>

Since the connection string itself is unchanged, this results in a connection to the same database.
If you are curious enough to kill the cat, you must be wondering what happens if we use both of them in app.config. One would never do this in a real scenario, but it doesn't hurt to find out. Believe me, ignorance is not bliss. Let's first remove the existing database from the SQL Server instance.
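For reference, an app.config containing both names would look roughly like the fragment below. It is only a sketch that combines the two entries used in this post; nothing else is assumed.

```xml
<connectionStrings>
  <!-- Entry named after the DbContext type name -->
  <add name="InstituteEntities"
       providerName="System.Data.SqlClient"
       connectionString="Server=.\SQLExpress; Trusted_Connection=true; Database=EFCodeFirstDatabaseCreation.Entities.InstituteEntities"/>
  <!-- Entry named after the fully qualified DbContext type name -->
  <add name="EFCodeFirstDatabaseCreation.Entities.InstituteEntities"
       providerName="System.Data.SqlClient"
       connectionString="Server=.\SQLExpress; Trusted_Connection=true; Database=EFCodeFirstDatabaseCreation.Entities.InstituteEntities"/>
</connectionStrings>
```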
Now we run the application. Since the framework cannot decide which connection string to use, it throws an exception.
The details of the exception show that the framework could not work out which connection to build.
Debugging shows that the exception is raised during construction of the DbContext instance.
No Connection string found in config?
Now let's discuss what happens when no connection string matching the criteria used by EF Code First is found in app.config. The framework still doesn't give up. It assumes that a SQL Server instance named SQLExpress is running on the local machine, which is the default instance name when SQL Server Express is installed. This is a fairly good assumption: developers on the Microsoft platform usually have SQL Server Express installed, and since we developers are fairly lazy about changing installer defaults, it usually has the default instance name. Let's comment out the connection strings in app.config and see what happens. When we run the application, the DbContext is instantiated successfully. Using the SQL Server Express instance, the framework has automatically created a database for us, named after the fully qualified name of the DbContext subtype.
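If the local SQLExpress default does not suit a machine, the convention itself can be redirected. The following is a minimal sketch, assuming the EF 4.x-era API in which Database.DefaultConnectionFactory is settable and SqlConnectionFactory accepts a base connection string. The ConnectionDefaults class and its method name are hypothetical, and the connection string simply spells out the local SQLExpress instance explicitly.

```csharp
using System.Data.Entity;
using System.Data.Entity.Infrastructure;

// Hypothetical helper; call it once at application start-up,
// before any DbContext is created.
public static class ConnectionDefaults
{
    public static void UseLocalSqlExpress()
    {
        // The base connection string carries no database name; the factory
        // adds the database name that EF resolves for each context.
        Database.DefaultConnectionFactory = new SqlConnectionFactory(
            @"Data Source=.\SQLEXPRESS; Integrated Security=True; MultipleActiveResultSets=True");
    }
}
```

Pointing the base connection string at another server name would, under the same assumptions, make the convention create databases there instead.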
Suggesting Connection String Name:
As discussed earlier, the framework uses the DbContext type name to look up the connection string in app.config, so the connection string must have the same name as the DbContext type. What if we want to use a specific name? We might follow different naming conventions for type names and configuration entries. They might even be managed by different teams altogether. In that case, how can we help the framework use a specific connection string?
DbContext supports this: we just need to use the relevant DbContext constructor. Let's update the constructor of InstituteEntities to use that base type constructor.
    public InstituteEntities()
        : base("instituteConnectionString")
    {
        this.Configuration.ProxyCreationEnabled = true;
        this.Configuration.AutoDetectChangesEnabled = true;
    }

Now the default constructor of InstituteEntities uses the specific constructor of the base class to specify the connection string name in app.config. Let's update app.config and add the expected connection string as follows:

    <connectionStrings>
      <add name="instituteConnectionString"
           providerName="System.Data.SqlClient"
           connectionString="Server=.\SQLExpress; Trusted_Connection=true; Database=SuggestedInstituteDatabase"/>
    </connectionStrings>

If the framework uses the above connection string, it looks for a database named SuggestedInstituteDatabase. If the database is not found, it creates it for us. Let's run the application now. There are no exceptions. Now go to SQL Server Management Studio and refresh the Databases node. That's right, it has created the expected database for us.
Suggesting Database name but still using default configuration:
What if we don't want to add anything to app.config but still want to use a specific database name? The framework supports that too. As a matter of fact, the same DbContext constructor is used. The string parameter can be either the name of a connection string in app.config or a database name. If no connection string with that name is found, the framework creates a database with that name in the default SQL Server Express instance on the local machine. The documentation of this DbContext constructor describes the same behavior.
Before going any further, comment out the connection string in app.config. You can also delete the database from the SQL Server instance. Let's update the constructor to specify the database name instead.
    public InstituteEntities()
        : base("SuggestedInstituteDatabase")
    {
        this.Configuration.ProxyCreationEnabled = true;
        this.Configuration.AutoDetectChangesEnabled = true;
    }

Now run the application and see the created database in SQL Server Management Studio. (If you are already running SQL Server Management Studio, you need to refresh the databases.)
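Because the same string can mean either a connection string name or a database name, a typo in the name silently falls back to creating a database by convention. DbContext also accepts the form "name=...", which restricts the lookup to the configuration file. A minimal sketch, reusing the connection string name from the example above:

```csharp
using System.Data.Entity;

public class InstituteEntities : DbContext
{
    // The "name=" prefix states that the value must be the name of a
    // connection string in the config file. If no such entry exists, EF
    // raises an exception instead of creating a database by convention.
    public InstituteEntities()
        : base("name=instituteConnectionString")
    {
    }
}
```

This form is worth considering whenever the configuration entry is mandatory, since a missing entry then fails loudly rather than producing a stray database.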
Using Encrypted Connection String
The world is not so simple. There are extra curious people who want to see how we have done what we have done. They do it for fun, or for monetary or political gain. We want to keep our systems protected from their curiosity. The easiest option is to keep the connection string encrypted so that, even if they are able to access app.config, they cannot read the connection details from it.
This requirement pushes us to look at how Entity Framework resolves the connection string.
Using pre-instantiated DbConnection to construct DbContext
DbContext provides several constructors, and a subtype constructor can choose which base class constructor to call. We can use the DbContext constructor that accepts a pre-instantiated DbConnection to be used for DbContext initialization. The code below uses this approach.

    public InstituteEntities()
        : base(new SqlConnection(Constants.ConnectionString), false)
    {
        //DbContext sub-type constructor initialization
    }

Since we are using a SQL Server database to hold the entities, we pass a SqlConnection instance to the DbContext constructor. We need to specify which connection string this DbConnection should use. The above code gets it from the static ConnectionString member of the Constants class. It is now up to us how we load or build the value of ConnectionString.

    public static class Constants
    {
        public static string ConnectionString
        {
            get { return GetDecryptedConnectionString(); }
        }

        private static string GetDecryptedConnectionString()
        {
            return @"Server=.\SQLExpress;Trusted_Connection=true;Database=SuggestedInstituteDatabase";
        }
    }

In the above code, ConnectionString gets its value from the GetDecryptedConnectionString() method. It is in this method that we can load or build the connection string. If it is encrypted, we can decrypt it here and return it to the calling code. To test this code, let's delete SuggestedInstituteDatabase from the local SQL Server instance. When we run the application, the database should be created automatically by EF Code First, using the connection details passed in the constructor.
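The decryption step is left open above. One way to fill in a method like GetDecryptedConnectionString() is sketched below with the AES implementation from System.Security.Cryptography. The ConnectionStringProtector class, its method names and the convention of storing the IV in front of the cipher text are all assumptions made for illustration, and where the key comes from is deliberately left out.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of Entity Framework.
public static class ConnectionStringProtector
{
    // Encrypts the text with AES and returns Base64 of (IV + cipher text).
    public static string Encrypt(string plainText, byte[] key)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV();
            byte[] iv = aes.IV;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] plain = Encoding.UTF8.GetBytes(plainText);
                byte[] cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
                byte[] result = new byte[iv.Length + cipher.Length];
                Buffer.BlockCopy(iv, 0, result, 0, iv.Length);
                Buffer.BlockCopy(cipher, 0, result, iv.Length, cipher.Length);
                return Convert.ToBase64String(result);
            }
        }
    }

    // Reverses Encrypt: reads the IV from the front, then decrypts the rest.
    public static string Decrypt(string protectedText, byte[] key)
    {
        byte[] input = Convert.FromBase64String(protectedText);
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            byte[] iv = new byte[aes.BlockSize / 8];
            Buffer.BlockCopy(input, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                byte[] plain = decryptor.TransformFinalBlock(input, iv.Length, input.Length - iv.Length);
                return Encoding.UTF8.GetString(plain);
            }
        }
    }
}
```

The key must be 16, 24 or 32 bytes long. Keeping that key safe is the real problem in practice, so a production system would typically rely on an established mechanism for protecting secrets rather than on a hand-rolled helper like this one.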
EF Code First Provider Model
Like ADO.NET, Entity Framework Code First uses a provider model for database management systems. It gives third-party library developers a framework for writing providers that it can use to work with a particular database. A default provider for SQL Server is available in the framework. To work with another database, you need to download the specific provider library and use it in your code instead. We won't spend any time on that here; we still want to use SQL Server, so we keep the default provider. We just want to modify it so that it can use the encrypted connection string.
As discussed above, providers must implement various abstractions that we can use in our code. One such abstraction is the connection factory. As its name implies, it is used to create connections for a particular Database Management System [DBMS].
    class EncryptedIDbConnectionFactory : IDbConnectionFactory
    {
        #region Private Fields

        IDbConnectionFactory _connectionFactory;

        #endregion

        #region Constructors

        public EncryptedIDbConnectionFactory(IDbConnectionFactory dbConnectionFactory)
        {
            if (dbConnectionFactory == null)
            {
                // The single-argument constructor expects the parameter name
                throw new ArgumentNullException("dbConnectionFactory");
            }

            _connectionFactory = dbConnectionFactory;
        }

        #endregion

        #region IDbConnectionFactory implementation

        public DbConnection CreateConnection(string nameOrConnectionString)
        {
            //decryption of connection string
            string decryptedConnectionString = GetDecryptedConnectionString(nameOrConnectionString);

            return _connectionFactory.CreateConnection(decryptedConnectionString);
        }

        #endregion

        #region Private Methods

        private string GetDecryptedConnectionString(string nameOrConnectionString)
        {
            //use some encryption library to decrypt
            return nameOrConnectionString;
        }

        #endregion
    }

A connection factory like this still uses the existing factory to create the database connection. We are only adding the behavior that the connection string is decrypted before it is used to build the connection. This is an implementation of the Decorator design pattern. You can use an encryption library to decrypt the connection string in the GetDecryptedConnectionString method. To keep the example simple, we just return the same string to the calling code. Consider it your assignment to plug in an appropriate encryption library.
We need to tell the EF Code First framework to use the new database connection factory instead. This is done in application initialization code, before any DbContext is initialized.

    Database.DefaultConnectionFactory = new EncryptedIDbConnectionFactory(Database.DefaultConnectionFactory);

How is this approach different from the previous one? In this approach the construction of the DbConnection is still delegated to the DefaultConnectionFactory. We are just helping it with the encrypted connection string, as the default factory expects it unencrypted. It can also still use all the conventions of Entity Framework Code First. This approach is also very extensible: we can add features to the connection factory without actually extending it. Depending on the requirement, we can decorate the factory with additional decorators and we are good to go. For example, we can add tracing or caching decorators and use them with the default connection factory. When we run the application now, we should see exactly the same database created as in the previous example.
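To illustrate the tracing idea, here is a minimal sketch of a second decorator. TracingDbConnectionFactory is a hypothetical name; only IDbConnectionFactory and its CreateConnection method come from Entity Framework.

```csharp
using System;
using System.Data.Common;
using System.Data.Entity.Infrastructure;
using System.Diagnostics;

// Hypothetical decorator: records each request, then delegates the actual
// work to the factory it wraps.
public class TracingDbConnectionFactory : IDbConnectionFactory
{
    private readonly IDbConnectionFactory _inner;

    public TracingDbConnectionFactory(IDbConnectionFactory inner)
    {
        if (inner == null)
        {
            throw new ArgumentNullException("inner");
        }

        _inner = inner;
    }

    public DbConnection CreateConnection(string nameOrConnectionString)
    {
        // Only the fact that a connection was requested is traced; the value
        // itself may contain credentials and is deliberately not written out.
        Trace.WriteLine("Connection requested from " + _inner.GetType().Name);

        return _inner.CreateConnection(nameOrConnectionString);
    }
}

// Possible start-up wiring, stacking it on the factory defined in this post:
// Database.DefaultConnectionFactory = new TracingDbConnectionFactory(
//     new EncryptedIDbConnectionFactory(Database.DefaultConnectionFactory));
```

Since every decorator exposes the same interface, the order of wrapping decides what each one sees: here the tracing decorator receives the still-encrypted string.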
Labels:
.net,
.net 4.5,
C#,
entity framework,
entity framework code first
Monday, March 5, 2012
Entity Framework Code First - Change Tracking
In this post we discuss the change tracking feature of Entity Framework Code First. Change tracking allows Entity Framework to keep track of all changes to the data of entities. This might involve adding new entities to an entity collection, or modifying or removing existing entities. These changes are kept at the DbContext level. All changes are lost if they are not saved before the DbContext instance is destroyed.
By default, Entity Framework Code First registers all changes as they occur. When it is time to save those changes, it looks at this information and updates the database tables accordingly. Additionally, it keeps a snapshot of entities as they were loaded from the database or last saved to it. This snapshot and automatic change tracking are used together to push entity changes to the database.
Enabling Change Tracking
Automatic change tracking is enabled by default. When it is disabled, the DbContext is no longer updated for each change in an entity; it picks up the changes only when change detection is run, for example through an explicit call to DbContext.ChangeTracker.DetectChanges(). Change tracking can be enabled or disabled by setting AutoDetectChangesEnabled to true or false respectively on DbContext.Configuration.
Change Tracking Proxies Vs Snapshot Change Tracking
Entity Framework creates a snapshot of all entity data when entities are loaded from the database. When it needs to save them, it compares this snapshot with their current state and then adds, updates or deletes rows based on the state of each entity. Saving entities is an implementation of the Unit of Work (UoW) pattern described by Martin Fowler. Since EF uses the state of an entity to decide which operation (create, update or delete) to perform, the states must be up to date before the entities are saved. To spare developers from doing this by hand, EF implicitly updates the states before saving, if required. This can be very costly when many entities are tracked. To optimize this, it is better to call DbContext.ChangeTracker.DetectChanges() explicitly when it is safe to do so. After the changes are pushed to the database, another snapshot is taken, which is used as the OriginalValues of these entities.
There are also other occasions on which DetectChanges() is implicitly called by the framework while automatic detection is enabled. According to MSDN, they are as follows:
- DbSet.Find
- DbSet.Local
- DbSet.Add
- DbSet.Attach
- DbSet.Remove
- DbContext.SaveChanges
- DbContext.GetValidationErrors
- DbContext.Entry
- DbChangeTracker.Entries
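The optimization described above can be sketched as follows. This is a minimal illustration, not a benchmarked recipe: the Department class with a DepartmentId key, the Departments set and the BulkInsert helper are assumptions made so that the example is self-contained.

```csharp
using System.Collections.Generic;
using System.Data.Entity;

// Assumed shape of the entity and context used in this series.
public class Department
{
    public int DepartmentId { get; set; }
    public string DepartmentName { get; set; }
}

public class InstituteEntities : DbContext
{
    public DbSet<Department> Departments { get; set; }
}

// Hypothetical helper showing one explicit DetectChanges call per batch.
public static class BulkInsert
{
    public static void AddDepartments(InstituteEntities context, IEnumerable<string> names)
    {
        bool previous = context.Configuration.AutoDetectChangesEnabled;
        try
        {
            // Avoid a change detection scan on every Add call
            context.Configuration.AutoDetectChangesEnabled = false;

            foreach (string name in names)
            {
                context.Departments.Add(new Department { DepartmentName = name });
            }

            // One explicit scan before saving, since the automatic one is off
            context.ChangeTracker.DetectChanges();
            context.SaveChanges();
        }
        finally
        {
            // Restore whatever the caller had configured
            context.Configuration.AutoDetectChangesEnabled = previous;
        }
    }
}
```

The try/finally block matters more than it looks: a context left with automatic detection switched off can silently skip modifications made later by unrelated code.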
Entity States
An entity goes through various states during its lifetime, depending on the operations performed on its properties. Operations on the DbContext can also change the state of an entity. The possible states are Added, Unchanged, Modified, Deleted and Detached.
The transitions between them are the ones you might expect in realistic scenarios. There are a few other, less realistic transitions too. For example, if an entity that already exists is added using DbSet.Add(), its state changes to Added. This is semantically wrong; in a typical implementation, one would expect an exception in this case. Change tracking keeps the state of an entity up to date. If we modify any property of an entity, its state changes to Modified.
Let's see the effect of enabling and disabling auto-detection of entity changes by executing the same code block under both settings. We update the DepartmentName of the first department found in the Departments collection. We then use the DbContext to get the original and updated values of that Department entity and print them along with the state of the entity. The generic DbContext.Entry() method returns the DbEntityEntry object for a particular entity. This is the central type of the DbContext API. Once we have it, we can access its OriginalValues and CurrentValues properties. Both are of type DbPropertyValues, a collection of all the properties of the underlying entity or complex object. We can then use the GetValue<T> method to get the value of any property by providing its name.
With auto-detection enabled, updating the name of the Department changes the state of the entity to Modified.
On the contrary, if we disable change tracking, the state of an entity is not updated until change detection is triggered. We can trigger it by calling DbContext.ChangeTracker.DetectChanges(). While automatic detection is enabled, EF also calls it implicitly from SaveChanges(). With change tracking disabled, we can still see the old and new values of the properties of an entity, but its state remains Unchanged.
Note: Although the state of the entity depends on the auto-detection setting, its data reflects the changes under both settings. This is because the current values are read from the entity object itself, whereas the state is only updated when change detection runs, which DbContext.Entry() triggers implicitly only while auto-detection is enabled.
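A minimal sketch of this comparison is given below. The DepartmentId key and the ChangeTrackingDemo helper are assumptions made for illustration. The state is returned as text so that the example does not depend on the namespace of EntityState, which differs between Entity Framework versions.

```csharp
using System.Data.Entity;
using System.Linq;

// Assumed shape of the entity and context used in this series.
// The properties are not virtual, so snapshot change tracking applies.
public class Department
{
    public int DepartmentId { get; set; }
    public string DepartmentName { get; set; }
}

public class InstituteEntities : DbContext
{
    public DbSet<Department> Departments { get; set; }
}

public static class ChangeTrackingDemo
{
    // Renames the first department and reports the state that the change
    // tracker holds for it afterwards. Nothing is saved to the database.
    public static string StateAfterRename(InstituteEntities context, bool autoDetect)
    {
        context.Configuration.AutoDetectChangesEnabled = autoDetect;

        Department department = context.Departments.First();
        department.DepartmentName = department.DepartmentName + " (updated)";

        // Entry() runs change detection only while auto-detection is enabled
        return context.Entry(department).State.ToString();
    }
}
```

Called on a fresh context each time, this should report Modified when autoDetect is true and Unchanged when it is false, in line with the behavior described above.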
Capturing Data Changes in Code
To capture changes to the data, Entity Framework provides a comprehensive API. A number of types are provided, mostly in the System.Data.Entity.Infrastructure namespace of the EntityFramework assembly. The main type you will need is DbEntityEntry. It is available in both generic and non-generic flavors. Using the generic version saves a lot of typecasting later in the code. We can get the DbEntityEntry for any entity. It allows us to check the original values of the properties of an entity, and it also provides the state of the entity. As discussed above, this state depends on a number of factors, i.e. whether change tracking is enabled or whether DetectChanges() has been called. Now, how do we get the DbEntityEntry for a particular entity? We have a few options.
Using DbContext.Entry<TEntity>
We use this when we are already holding the entity object and need to obtain information about it. As in the earlier discussion, we get an arbitrary department from the DbContext and update the value of its DepartmentName property. DbEntityEntry gives us the old and new values of the properties of an entity as DbPropertyValues, whose GetValue method returns the value of any property.
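A minimal sketch of this usage follows. The DepartmentId key, the replacement name and the EntryValuesDemo helper are assumptions made for illustration.

```csharp
using System;
using System.Data.Entity;
using System.Linq;

// Assumed shape of the entity and context used in this series.
public class Department
{
    public int DepartmentId { get; set; }
    public string DepartmentName { get; set; }
}

public class InstituteEntities : DbContext
{
    public DbSet<Department> Departments { get; set; }
}

public static class EntryValuesDemo
{
    // Changes a department name in memory and prints what the entry reports.
    public static void PrintNameChange(InstituteEntities context)
    {
        Department department = context.Departments.First();
        department.DepartmentName = "Updated Department Name";

        // Passing a typed entity yields the generic DbEntityEntry<Department>
        var entry = context.Entry(department);

        string original = entry.OriginalValues.GetValue<string>("DepartmentName");
        string current = entry.CurrentValues.GetValue<string>("DepartmentName");

        Console.WriteLine("Original: {0}, Current: {1}, State: {2}",
            original, current, entry.State);
    }
}
```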
Using DbContext.ChangeTracker.Entries<TEntity>
This is another option to get the DbEntityEntry for an entity. We generally use it when we are not holding on to the actual entity object. It also has generic and non-generic versions, but they are operationally different. The generic version provides DbEntityEntry objects for the entities of a given type tracked by the DbContext. The non-generic version provides them for ALL entities tracked by the context, which is not needed most of the time. Additionally, the generic version returns the generic DbEntityEntry and the non-generic version returns the non-generic one. We can do practically the same thing as with DbContext.Entry(), but get the entries from DbContext.ChangeTracker instead. The result is an IEnumerable, so we can iterate over the individual entries.
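A minimal sketch of iterating over the tracked entries could look like this. The DepartmentId key and the TrackerEntriesDemo helper are assumptions made for illustration.

```csharp
using System;
using System.Data.Entity;

// Assumed shape of the entity and context used in this series.
public class Department
{
    public int DepartmentId { get; set; }
    public string DepartmentName { get; set; }
}

public class InstituteEntities : DbContext
{
    public DbSet<Department> Departments { get; set; }
}

public static class TrackerEntriesDemo
{
    // Prints every Department the context currently tracks and returns
    // how many there were.
    public static int PrintTrackedDepartments(InstituteEntities context)
    {
        int count = 0;

        // The generic overload returns DbEntityEntry<Department> objects,
        // so entry.Entity needs no cast.
        foreach (var entry in context.ChangeTracker.Entries<Department>())
        {
            Console.WriteLine("{0} [{1}]", entry.Entity.DepartmentName, entry.State);
            count++;
        }

        return count;
    }
}
```

Only entities the context already tracks are listed, so a context that has not loaded or added anything yields no entries.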
Change Tracking for Individual members of an Entity
In the above code, we saw how we can get the DbEntityEntry with current and old values. Code first lets us more fine grained interfaces through which we can capture individual properties of an entity. There are different types provided for this. All of these classes inherit from DbMemberEntry<TEntity>. It is an abstract class. All scalar properties of an entity use DbPropertyEntry. The complex properties uses futher specialized version of this and it uses DbComplexPropertyEntry. DbReferenceEntry and DbCollectionEntry are used by reference and collection based navigation properties respectively. We can present the relationship between these types as follows:
DbEntityEntry Vs ObjectStateEntry
Before there were humans in this planet we call earth, there used to be a completely different world. There used to be a different creatures, some of them we call as Dinosaurs. But somehow they got destroyed and a new world was created. New species came into being. Even if we don't indulge into the debate of evolution Vs Intelligent design then there has been continuous evolution of human thoughts. We are learning about various hidden secrets of the universe and helping others learn. Similarly, before DbContext API, there used to be ObjectContext API. For some reason the all-knowing Creator of the framework [Microsoft] decided to get rid of the older, more complicated ObjectContext world and hence new DbContext API was born. This is definitely an intelligent design but the new species are also evolving and you see different versions like 4.2, 4.3 and so on. The problem is that they can co-exist. There has been a door to go back in time using IObjectContextAdapter to get ObjectContext through DbContext and here the fun begins. Now we can use the features of both the world in the same space and time. ObjectContext API used to have a similar feature like DbEntityEntry, it was ObjectStateEntry. Don't be surprised if you find them in older code or see them co-exist with new DbContext API. But don't use them in any new code after you have release version of tools supporting DbContext API available unless there is any legitimate reason.
Complex Type and Change Tracking
Like lazy loading, change tracking is also not supported for complex types. This is because the complex types are not real entities. They are syntactic sugar to group a number of fields. Each member of a complex type would be created as a column in the main entity. It has no primary key. The complex type based property is itself considered for change tracking. So if we assign a new instance to a complex type reference, it is tracked automatically. Let's make changes to CourseLocation and LocationAddress to support proxy creation. [http://msdn.microsoft.com/en-us/library/dd468057.aspx]
Let's make some changes in the properties of a complex type for an entity and see how this affects the state of the entity itself. We can not check the state of a complex type directly using DbContext.Entries() or DbChangeTracker.Entries() as DbContext.ChangeTracker.Entries() .
Interesting...But isn't that contrary to what we just discussed? i.e. there is no automatic change tracking for complex type. Yes, that is also true. But how would we justify this output with this statement. Let me explain. Basically, the state update that we see for CourseLocation entity after updating the values of LocationAddress is not because of automatic change tracking. It is because of using DbContext.ChangeTracker.Entries to get the entity's states. It also has an implicit call to DbContext.DetectChanges() as we discussed above. So, snapshot change has been pushed which has resulted in this behavior. Let's verify this with ObjectContext API to see if that is the case. We update the same method as follows:
We can also use ComplexProperty method of DbEntityEntry to get the changes in complex property.
Download Code:
By default, Entity Framework Code First tracks the changes made to entities as they occur. When it is time to save, it uses this information to update the database tables. Additionally, it keeps a snapshot of each entity as it was loaded from the database or as it was last saved. This snapshot, together with automatic change detection, is used to push the entities' changes to the database.
Enabling Change Tracking
Automatic change detection is enabled by default. When it is disabled, the context does not pick up each change made to an entity; the changes are registered only when change detection is triggered, for example by an explicit call to DetectChanges(). It can be enabled or disabled by setting AutoDetectChangesEnabled to true or false on the DbContext configuration.
public InstituteEntities()
{
    this.Configuration.AutoDetectChangesEnabled = true;
}
This ensures that all the changes made to any entity in the context are tracked by the framework. The framework maintains the state of entities and uses it to determine the changes that need to be pushed to the database when SaveChanges() is called. Disabling automatic change detection still allows us to check the old and current values of an entity, but the state stays Unchanged until the changes are detected. We then need to call DetectChanges() manually (it is exposed through DbContext.ChangeTracker) to update the states. There are also instances in which it is called implicitly by the Entity Framework API. DetectChanges() checks all the tracked entities of all types to verify whether their data has changed; if so, it sets their state to Modified. This helps the framework push all the entities in the Added, Modified and Deleted states when SaveChanges() is called on DbContext. Here SaveChanges() is an implementation of the Unit of Work (UoW) pattern described by Martin Fowler.
Change Tracking Proxies Vs Snapshot Change Tracking
Entity Framework creates a snapshot of the data of all entities when they are loaded from the database. When it needs to save these entities, it compares this snapshot to their current values. It then updates the database based on the state of each entity: it might insert, update or delete the entity. Saving entities is the implementation of the Unit of Work (UoW) pattern described by Martin Fowler. Since EF uses the states of entities to decide which CRUD [Create, Read, Update, Delete] operation to perform, the states must be up to date before the entities are saved. To spare developers this step, EF implicitly updates the states before saving, if required. This can be very costly if many entities are tracked. To optimize this, it is better to call DbContext.ChangeTracker.DetectChanges() explicitly when it is safe to do so. After the changes are pushed to the database, another snapshot is taken, which is then used as the OriginalValues of these entities.
There are also other instances when DetectChanges() is implicitly called by the framework. According to MSDN, they are as follows:
- The Add, Attach, Find, Local, or Remove members on DbSet
- The GetValidationErrors, Entry, or SaveChanges members on DbContext
- The Entries method on DbChangeTracker
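Because each of these calls may trigger change detection, adding many entities in a loop can become slow. The following is a minimal sketch of one common way to reduce that cost, assuming the InstituteEntities context and the Department entity used in this article. The BulkAddDepartments helper and its names parameter are hypothetical and exist only for illustration; it also assumes that a Department can be saved with just its DepartmentName set.

```csharp
using System.Collections.Generic;

public static class ChangeDetectionSketch
{
    // Hypothetical helper: adds many departments while skipping the
    // change detection that DbSet.Add() would otherwise trigger per call.
    public static void BulkAddDepartments(InstituteEntities context, IEnumerable<string> names)
    {
        bool previous = context.Configuration.AutoDetectChangesEnabled;
        context.Configuration.AutoDetectChangesEnabled = false;
        try
        {
            foreach (string name in names)
            {
                context.Departments.Add(new Department { DepartmentName = name });
            }
        }
        finally
        {
            // Restore the original setting so later operations behave as before.
            context.Configuration.AutoDetectChangesEnabled = previous;
        }

        // With the setting restored, SaveChanges() runs change detection once.
        context.SaveChanges();
    }
}
```

The try/finally block is there so that the context is not left with change detection switched off if one of the Add calls throws.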
Entity States
An entity goes through various states throughout its lifetime. It is based on various operations performed to change its properties. The operations on DbContext can also result in updating the state of an entity. It can roughly be presented in a state diagram as follows:
These are the main state transitions you might expect in realistic scenarios. There are a few other transitions which are less realistic, e.g. if an entity that already exists is added using DbSet.Add(), its state changes to Added. Since the entity is an object that already exists, this is semantically wrong; in a typical implementation one would expect an exception in this case. Change tracking keeps an entity's state up to date: if we modify any property of an entity, its state changes to Modified.
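The main transitions can be observed with a short sketch such as the one below. It assumes the InstituteEntities context and Department entity from this article, automatic change detection left enabled, and that a Department can be saved with only its DepartmentName set. The ShowStates and Print methods are hypothetical names used for illustration.

```csharp
using System;

public static class EntityStateSketch
{
    public static void ShowStates(InstituteEntities context)
    {
        var department = new Department { DepartmentName = "Physics" };
        Print(context, department);   // not yet known to the context

        context.Departments.Add(department);
        Print(context, department);   // after DbSet.Add()

        context.SaveChanges();
        Print(context, department);   // after the insert has been saved

        department.DepartmentName = "Applied Physics";
        Print(context, department);   // after a property change

        context.Departments.Remove(department);
        Print(context, department);   // after DbSet.Remove()

        context.SaveChanges();
        Print(context, department);   // after the delete has been saved
    }

    private static void Print(InstituteEntities context, Department department)
    {
        // Entry() looks up (or creates) the tracking entry for this object.
        Console.WriteLine(context.Entry(department).State);
    }
}
```

Under those assumptions one would expect the states to be printed in the order Detached, Added, Unchanged, Modified, Deleted and Detached.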
Let's see the effect of enabling / disabling the auto-detection of changes of entities. Let's execute the same code block under both conditions and see the effect of setting the auto-detection to an appropriate value. Here we are updating the DepartmentName of the first department found in the Departments entities collection and setting it to an updated value. We are using DbContext to get the original and updated value of the particular Department entity. Then we are printing the original and updated value of the Department entity along with the state information of these entities. The generic DbContext.Entry() method lets us get the DbEntityEntry object for a particular entity. This is the central type in DbContext API. Once we get it, we can access OriginalValues and CurrentValues properties. Both of them are of type DbPropertyValues. It is a collection of all the properties of an underlying entity or a complex object. Now we can use GetValue<T> method to get the value of any property by providing its name.
using (var context = new InstituteEntities())
{
    var department = context.Departments.First<Department>();
    department.DepartmentName = "Computer & Information Systems Engineering";

    DbEntityEntry<Department> departmentEntry = context.Entry<Department>(department);
    DbPropertyValues originalDepartment = departmentEntry.OriginalValues;
    DbPropertyValues updatedDepartment = departmentEntry.CurrentValues;
    EntityState state = context.Entry<Department>(department).State;

    Console.WriteLine(
        string.Format("State: {0}, Old Value: {1}, New Value: {2}",
            state,
            originalDepartment.GetValue<string>("DepartmentName"),
            updatedDepartment.GetValue<string>("DepartmentName")));
}
Here we have enabled auto detection. As you can see updating Department's name updates the state of the entity as Modified.
public InstituteEntities() { this.Configuration.AutoDetectChangesEnabled = true; }
On the contrary, if we disable automatic change detection, the state of an entity is not updated until change detection is triggered. We can do that by calling DbContext.ChangeTracker.DetectChanges(). With auto-detection enabled, EF also calls it implicitly when SaveChanges() is called. In the following example, we are disabling automatic change detection. You can see that, although we can still see the old and new values of the properties of an entity, its state stays Unchanged.
public InstituteEntities() { this.Configuration.AutoDetectChangesEnabled = false; }
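Note: As you can see, although the state of the entity is affected by the auto-detection setting, its data still reflects the changes. This is because CurrentValues are read from the entity object itself, while OriginalValues come from the snapshot; only the State depends on change detection having run. DbContext.Entry() triggers change detection implicitly only while auto-detection is enabled.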
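To see the state catch up once change detection runs, a sketch along the following lines can be used. It assumes the article's InstituteEntities context, that Department is tracked by snapshot (i.e. it is not a change-tracking proxy), and that at least one department exists. The class and method names are hypothetical. Nothing is saved, so the database is left untouched.

```csharp
using System;
using System.Linq;

public static class ManualDetectChangesSketch
{
    public static void Run()
    {
        using (var context = new InstituteEntities())
        {
            context.Configuration.AutoDetectChangesEnabled = false;

            var department = context.Departments.First();
            department.DepartmentName = department.DepartmentName + " (renamed)";

            // Auto-detection is off, so Entry() does not run change detection here.
            Console.WriteLine("Before DetectChanges: {0}", context.Entry(department).State);

            // Compare the current values against the snapshot explicitly.
            context.ChangeTracker.DetectChanges();

            Console.WriteLine("After DetectChanges: {0}", context.Entry(department).State);
        }
    }
}
```

Under those assumptions the first line should report Unchanged and the second Modified.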
Capturing Data Changes in Code
In order to capture changes to the data, Entity Framework provides a comprehensive API. A number of types are provided, mostly in the System.Data.Entity.Infrastructure namespace of the EntityFramework assembly. The main type that you would need is DbEntityEntry. It is available in both generic and non-generic flavors. Using the generic version saves us from a lot of typecasting later in the code. We can get the DbEntityEntry for any entity. It allows us to check the original values of the properties of an entity. It also provides the state of the entity. As we discussed above, this state depends on a number of factors, i.e. whether automatic change detection is enabled or DetectChanges() has been called. Now, how do we get the DbEntityEntry for a particular entity? We have a few options here.
Using DbContext.Entry<TEntity>
We use this when we are already holding the entity object and we need to obtain information about the entity. In the following example we are getting an arbitrary department from the DbContext and updating the value of its DepartmentName property. DbEntityEntry allows us to get the old and new values of an entity's properties as DbPropertyValues. This can be used to get the value of any property using its GetValue method, as you have seen above:
using (var context = new InstituteEntities())
{
    var department = context.Departments.First<Department>();
    department.DepartmentName = "Computer & Information Systems Engineering";

    DbEntityEntry<Department> departmentEntry = context.Entry<Department>(department);
    DbPropertyValues originalDepartment = departmentEntry.OriginalValues;
    DbPropertyValues updatedDepartment = departmentEntry.CurrentValues;
    EntityState state = context.Entry<Department>(department).State;

    Console.WriteLine(
        string.Format("State: {0}, Old Value: {1}, New Value: {2}",
            state,
            originalDepartment.GetValue<string>("DepartmentName"),
            updatedDepartment.GetValue<string>("DepartmentName")));
}
Using DbContext.ChangeTracker.Entries<TEntity>
This is another option to get the DbEntityEntry for an entity. We generally use this when we are not holding on to the actual entity object. This also has generic and non-generic versions, but they are operationally different. The generic version provides DbEntityEntry objects for entities of a given type tracked by the DbContext. The non-generic version, on the other hand, provides them for ALL entities tracked by the context, which might not be needed most of the time. Additionally, the generic version returns the generic DbEntityEntry and the non-generic version returns the non-generic one. In the following example, we are doing practically the same thing as in the example above, but we are getting the entries using DbContext.ChangeTracker. The result is an IEnumerable, so we can iterate over it to work with the individual entries.
using (var context = new InstituteEntities())
{
    var department = context.Departments.First<Department>();
    department.DepartmentName = "Computer & Information Systems Engineering";

    DbChangeTracker changeTracker = context.ChangeTracker;
    IEnumerable<DbEntityEntry<Department>> departmentEntries = changeTracker.Entries<Department>();

    foreach (DbEntityEntry<Department> departmentEntry in departmentEntries)
    {
        DbPropertyValues originalDepartment = departmentEntry.OriginalValues;
        DbPropertyValues updatedDepartment = departmentEntry.CurrentValues;
        // Read the state from the entry being iterated rather than from the first department.
        EntityState state = departmentEntry.State;

        Console.WriteLine(
            string.Format("State: {0}, Old Value: {1}, New Value: {2}",
                state,
                originalDepartment.GetValue<string>("DepartmentName"),
                updatedDepartment.GetValue<string>("DepartmentName")));
    }
}
Change Tracking for Individual members of an Entity
In the above code, we saw how we can get the DbEntityEntry with current and old values. Code First also gives us more fine-grained types through which we can work with individual properties of an entity. All of these classes inherit from DbMemberEntry<TEntity, TProperty>, which is an abstract class. All scalar properties of an entity use DbPropertyEntry. Complex properties use a further specialized version of it, DbComplexPropertyEntry. DbReferenceEntry and DbCollectionEntry are used by reference and collection based navigation properties respectively. We can present the relationship between these types as follows:
using (var context = new InstituteEntities())
{
    var department = context.Departments.First<Department>();
    department.DepartmentName = "Computer & Information Systems Engineering";

    DbChangeTracker changeTracker = context.ChangeTracker;
    IEnumerable<DbEntityEntry<Department>> departmentEntries = changeTracker.Entries<Department>();

    foreach (DbEntityEntry<Department> departmentEntry in departmentEntries)
    {
        DbPropertyEntry<Department, string> departmentNameEntry =
            departmentEntry.Property<string>((d) => d.DepartmentName);
        string oldDepartmentName = departmentNameEntry.OriginalValue;
        string modifiedDepartmentName = departmentNameEntry.CurrentValue;
        EntityState state = departmentEntry.State;

        Console.WriteLine(
            string.Format("State: {0}, Old Value: {1}, New Value: {2}",
                state, oldDepartmentName, modifiedDepartmentName));
    }
}
DbEntityEntry Vs ObjectStateEntry
Before there were humans on this planet we call Earth, there used to be a completely different world. There used to be different creatures, some of which we call dinosaurs. But somehow they were destroyed and a new world was created. New species came into being. Even if we don't indulge in the debate of evolution vs. intelligent design, there has been a continuous evolution of human thought. We are learning about various hidden secrets of the universe and helping others learn. Similarly, before the DbContext API, there used to be the ObjectContext API. For some reason the all-knowing Creator of the framework [Microsoft] decided to move away from the older, more complicated ObjectContext world, and hence the new DbContext API was born. This is definitely an intelligent design, but the new species is also evolving and you see different versions like 4.2, 4.3 and so on. The catch is that the two worlds can co-exist. There is a door to go back in time: IObjectContextAdapter lets us get the ObjectContext from a DbContext, and here the fun begins. Now we can use the features of both worlds in the same space and time. The ObjectContext API has a feature similar to DbEntityEntry; it is ObjectStateEntry. Don't be surprised if you find it in older code or see it co-exist with the new DbContext API. But don't use it in new code once you have a release version of the tools supporting the DbContext API, unless there is a legitimate reason.
Complex Type and Change Tracking
Like lazy loading, change tracking is also not supported for complex types. This is because the complex types are not real entities. They are syntactic sugar to group a number of fields. Each member of a complex type would be created as a column in the main entity. It has no primary key. The complex type based property is itself considered for change tracking. So if we assign a new instance to a complex type reference, it is tracked automatically. Let's make changes to CourseLocation and LocationAddress to support proxy creation. [http://msdn.microsoft.com/en-us/library/dd468057.aspx]
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;
    using System.ComponentModel.DataAnnotations;

    public class CourseLocation
    {
        public virtual int CourseLocationId { get; set; }
        public virtual string LocationName { get; set; }
        public virtual LocationAddress Address { get; set; }
        public virtual ICollection<Course> CoursesOffered { get; set; }
    }

    public class LocationAddress
    {
        public virtual string StreetAddress { get; set; }
        public virtual string Apartment { get; set; }
        public virtual string City { get; set; }
        public virtual string StateProvince { get; set; }
        public virtual string ZipCode { get; set; }
    }
}
Let's enable proxy creation and automatic change tracking by DbContext as follows:
public InstituteEntities() { this.Configuration.ProxyCreationEnabled = true; this.Configuration.AutoDetectChangesEnabled = true; }
Let's make some changes in the properties of a complex type for an entity and see how this affects the state of the entity itself. We cannot check the state of a complex type directly using DbContext.Entry() or DbChangeTracker.Entries(), because complex type instances are not tracked as separate entries.
using (var context = new InstituteEntities())
{
    var courseLocations = context.Set<CourseLocation>();
    foreach (var location in courseLocations)
    {
        Console.WriteLine("LocationName: {0}, City: {1}",
            location.LocationName, location.Address.City);
        location.Address.City = string.Format("Great City of {0}", location.Address.City);
        Console.WriteLine("Updated LocationName: {0}, City: {1}",
            location.LocationName, location.Address.City);
    }

    // Complex types not directly tracked
    var trackedLocationAddressEntities = context.ChangeTracker.Entries<LocationAddress>();
    Console.WriteLine("# Tracked LocationAddress: {0}", trackedLocationAddressEntities.Count());

    var trackedCourseLocations = context.ChangeTracker.Entries<CourseLocation>();
    Console.WriteLine("# Tracked CourseLocations: {0}", trackedCourseLocations.Count());

    string courseLocationEntityStates =
        string.Join(",",
            trackedCourseLocations.Select(l => l.State)
                .Select<EntityState, string>(n => Enum.GetName(typeof(EntityState), n))
                .ToArray<string>());
    Console.WriteLine("Tracked CourseLocation States: {0}", courseLocationEntityStates);
}
This would result in the following output:
Interesting... But isn't that contrary to what we just discussed, i.e. that there is no automatic change tracking for complex types? Yes, that is also true. So how do we reconcile this output with that statement? Let me explain. The state update that we see for the CourseLocation entity after updating the values of LocationAddress is not because of automatic change tracking. It is because we used DbContext.ChangeTracker.Entries() to get the entity's states, which implicitly calls DetectChanges() as we discussed above. So the snapshot comparison has been run, which has resulted in this behavior. Let's verify this with the ObjectContext API to see if that is the case. We update the same method as follows:
private static void TestMethod5()
{
    using (var context = new InstituteEntities())
    {
        var courseLocations = context.Set<CourseLocation>();
        foreach (var location in courseLocations)
        {
            Console.WriteLine("LocationName: {0}, City: {1}",
                location.LocationName, location.Address.City);
            location.Address.City = string.Format("Great City of {0}", location.Address.City);
            Console.WriteLine("Updated LocationName: {0}, City: {1}",
                location.LocationName, location.Address.City);
        }

        Console.Write("Using ObjectContext API to get EntityState");
        Console.WriteLine("*************************************");
        foreach (var location in courseLocations)
        {
            // IObjectContextAdapter exposes the underlying ObjectContext of a DbContext.
            var ocAdapter = ((IObjectContextAdapter)context).ObjectContext;
            var state = ocAdapter.ObjectStateManager.GetObjectStateEntry(location).State;
            Console.WriteLine("Location: {0}, City: {1}, EntityState: {2}",
                location.LocationName, location.Address.City, state);
        }
        Console.WriteLine("*************************************");

        Console.Write("Using DbContext API to get EntityState");
        Console.WriteLine("*************************************");

        // Complex types not directly tracked
        var trackedLocationAddressEntities = context.ChangeTracker.Entries<LocationAddress>();
        Console.WriteLine("# Tracked LocationAddress: {0}", trackedLocationAddressEntities.Count());

        var trackedCourseLocations = context.ChangeTracker.Entries<CourseLocation>();
        Console.WriteLine("# Tracked CourseLocations: {0}", trackedCourseLocations.Count());

        string courseLocationEntityStates =
            string.Join(",",
                trackedCourseLocations.Select(l => l.State)
                    .Select<EntityState, string>(n => Enum.GetName(typeof(EntityState), n))
                    .ToArray<string>());
        Console.WriteLine("Tracked CourseLocation States: {0}", courseLocationEntityStates);
    }
}
Now you can verify that when we get the states using the ObjectContext API, it shows them as Unchanged. This is because the ObjectContext API does not implicitly call DetectChanges(). But when we use the DbContext API, it shows them as Modified, which proves our point.
We can also use ComplexProperty method of DbEntityEntry to get the changes in complex property.
Address currentValues = context.Entry(user) .ComplexProperty(u => u.Address) .CurrentValue;
Download Code:
Wednesday, February 22, 2012
Entity Framework Code First - Relationship between Entities III [Loading Related Entities]
This post is amongst the series of our discussions about Entity Framework Code First. As an example, we have been building on top of Institute Entities. There are two things which you might have noticed in the example that we have been following. Let me bring them up:
No Implicit Eager Loading
It must be remembered that turning off lazy loading does not mean implicit eager loading. There is no such thing as implicit eager loading of entities in Entity Framework, except for complex types. This is because all entities might be related to each other in some way or another [foreign key relationships in databases]. If EF allowed implicit eager loading, then loading one entity could mean loading the whole database, which might be very costly. And since it would be implicit, developers would have no idea what is going on behind the scenes unless SQL Profiler is hooked up.
Lazy loading & Better Performance
The contrary is also not true i.e. lazy loading might or might not result in better performance. If we don't need data of related entities then the lazy loading would save us from querying additional data from the database. But if we need the navigational data then it means that there would be additional queries to the database when the navigational property is accessed. This might be unnecessary.
Proxy Is Entity Decorator
These types of loading are enabled in Entity Framework by introducing proxies. A proxy is an implementation of the Decorator design pattern that provides additional capabilities to user-defined entities. The proxy and the entity do not share a separate abstraction; instead, the proxy extends the entity. It inherits from the POCO entity and creates a field for the POCO type it is decorating. Then it decorates it by adding some features. One such feature is the loading behavior of related entities. Entity Framework is designed around the idea of convention over configuration, so defining our POCO entities in a certain way is how we communicate to Entity Framework the property behavior we expect in the proxy.
In the case of Student, we have defined the CourseStudents and CourseOfferedAt navigation properties as virtual. As a result, these related entities are not loaded when the related Student entity is loaded. If we just load a Student entity, the related Course(s) and CourseLocation(s) are not part of the query. EF loads them lazily when required.
As consumers of an Object Relational Mapping [ORM] tool, we don't want our entities to be adulterated by inheriting from types provided by Entity Framework. That type used to be EntityObject in earlier versions of Entity Framework. We also don't want to decorate them with attributes which bind us to a particular ORM tool. Entity Framework still supports attributes, by reusing some provided by the Data Annotations library and some custom ones developed by the Entity Framework team at Microsoft. In order to understand the various loading options for related entities, we need to understand the concept of proxies in Entity Framework and how they add certain features to our entities without modifying them. The features of proxies include lazy loading and change tracking of an entity's data. This makes it very clear that lazy loading is just a decoration of our entities provided by adding a proxy, so if we disable proxy creation, no lazy loading is available. This is exactly what happens.
http://msdn.microsoft.com/en-us/library/gg715126%28v=vs.103%29.aspx
The proxies created by Entity Framework for our entities are a decoration of these entities. They add the features of lazy loading and change tracking to the entity. This is done without modifying the entity, which follows the Open/Closed principle. The proxy can be used wherever the original entity is expected, which follows the Liskov Substitution principle. Remember that Entity Framework is smart enough to recognize the need for proxy creation. If the entity is defined in such a way that a proxy would not add any value, i.e. neither change tracking nor lazy loading, then EF does not create a proxy and works with the entity type itself. Proxy creation can also be disabled on the DbContext, which also results in the entity types being used directly.
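One way to check whether EF has handed back a proxy or the plain entity type is to compare the runtime type with the type returned by ObjectContext.GetObjectType. The sketch below assumes the article's InstituteEntities context and CourseLocation entity, and that at least one CourseLocation exists; the class and method names are hypothetical.

```csharp
using System;
using System.Data.Objects; // EF 4.x/5; in EF 6 the namespace is System.Data.Entity.Core.Objects
using System.Linq;

public static class ProxySketch
{
    public static void Run()
    {
        using (var context = new InstituteEntities())
        {
            var location = context.Set<CourseLocation>().First();

            Type runtimeType = location.GetType();
            // For a proxy this returns the POCO type it derives from;
            // for a plain entity it returns the same type.
            Type entityType = ObjectContext.GetObjectType(runtimeType);

            Console.WriteLine("Runtime type: {0}", runtimeType.FullName);
            Console.WriteLine("Entity type:  {0}", entityType.FullName);
            Console.WriteLine("Is proxy:     {0}", runtimeType != entityType);
        }
    }
}
```

Running it once with ProxyCreationEnabled set to true and once with false should make the difference between the two cases visible.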
Choice of Loading option and its Consequences
Generally, the setting for lazy loading is specified in the constructor of the sub-type of DbContext. In the following example, we are turning on lazy loading for all entities. This is the default behavior for entities, which they can override.
Now individual entities can override this behavior for their navigational properties as follows:
With lazy loading disabled, related entities are no longer loaded on demand when they are accessed through a navigation property. The only options remaining are explicit and eager loading. Both of them are specified through code.
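The context-wide setting and the per-property override described above might look like the following sketch. The entity classes here are simplified stand-ins written for this illustration, not the article's actual definitions: the key property names and the Students set are assumptions, and VehicleDetails is made non-virtual only to show how a property opts out.

```csharp
using System.Data.Entity;

public class Department
{
    public int DepartmentId { get; set; }
    public string DepartmentName { get; set; }
}

public class Vehicle
{
    public int VehicleId { get; set; }
}

public class Student
{
    public int StudentId { get; set; }

    // virtual: the proxy can override this property and load it lazily.
    public virtual Department StudentDepartment { get; set; }

    // Non-virtual (for illustration only): never loaded lazily,
    // so it has to be loaded eagerly or explicitly.
    public Vehicle VehicleDetails { get; set; }
}

public class InstituteEntities : DbContext
{
    public InstituteEntities()
    {
        // Context-wide default for all virtual navigation properties.
        this.Configuration.LazyLoadingEnabled = true;
    }

    public DbSet<Student> Students { get; set; } // assumed property name
}
```

Setting LazyLoadingEnabled to false in the constructor switches lazy loading off for every entity, regardless of whether its navigation properties are virtual.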
Eager Loading
How to make a decision?
The choice of pattern for loading related entities should be a deliberate one, because it impacts the overall performance of the system. Think of the difference in terms of locality of reference. As we know, there are two kinds of locality of reference. They are:
Entity Definition
Let's see what changes in Student entity would be needed to support lazy loading. First we have changed the access modifier to public. Additionally we have three related entities of Student. They are StudentCourse, VehicleDetails and StudentDepartment of type Course, Vehicle and Department respectively. Here VehicleDetails and StudentDepartment are defined as virtual to support lazy loading.
Let's see how LocationAddress is defined. Since this is a ComplexType, it does not support lazy loading.
Explicit Loading
Explicit loading fetches the required related entity's data from the database only when we ask for it. We can use the Load method on an EntityReference or EntityCollection. ObjectContext's LoadProperty method can also be used for the same purpose.
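With the DbContext API, the same idea is available through the Reference and Collection methods of DbEntityEntry. The following is a minimal sketch, assuming the article's InstituteEntities context and a Student entity with the StudentDepartment and StudentCourses navigation properties; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class ExplicitLoadingSketch
{
    public static void Run()
    {
        using (var context = new InstituteEntities())
        {
            var student = context.Set<Student>().First();

            // Reference navigation property: load the department only if it is missing.
            var departmentReference = context.Entry(student).Reference(s => s.StudentDepartment);
            if (!departmentReference.IsLoaded)
            {
                departmentReference.Load();
            }

            // Collection navigation property: load the courses only if they are missing.
            var coursesCollection = context.Entry(student).Collection(s => s.StudentCourses);
            if (!coursesCollection.IsLoaded)
            {
                coursesCollection.Load();
            }

            Console.WriteLine("Department loaded: {0}", departmentReference.IsLoaded);
            Console.WriteLine("Courses loaded:    {0}", coursesCollection.IsLoaded);
        }
    }
}
```

Each Load call issues its own query, so this pattern trades extra round trips for control over exactly when the related data is fetched.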
We can use an iterator to go through the items. It is defined as follows:
Lazy Loading Students
In the following example, we are getting any arbitrary department returned. We are then querying for all the students in the department and printing their details.
In Entity Framework Code First, eager loading is supported through Include. It not only supports loading directly related entities; we can also traverse the hierarchy of relationships. See how we are loading the vehicle details of all the students for all departments. We can also ensure type safety by using expressions. Here we are loading Students data using that.
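The expression-based form of Include can be sketched as follows. It assumes the article's InstituteEntities context with a Courses set, and a Course entity whose CourseStudents collection and CourseOfferedAt reference match the string paths used elsewhere in this post; the class and method names are hypothetical.

```csharp
using System;
using System.Data.Entity; // brings the lambda-based Include extension method into scope
using System.Linq;

public static class EagerLoadingSketch
{
    public static void Run()
    {
        using (var context = new InstituteEntities())
        {
            var courses = context.Courses
                .Include(c => c.CourseOfferedAt)
                // Select() inside Include walks through a collection to the next level.
                .Include(c => c.CourseStudents.Select(s => s.VehicleDetails))
                .Include(c => c.CourseStudents.Select(s => s.StudentDepartment))
                .ToList();

            Console.WriteLine("Courses loaded: {0}", courses.Count);
        }
    }
}
```

Compared with the string-based overload, a misspelled property name here is caught by the compiler rather than at run time.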
It must be remembered that complex types are always loaded in eager fashion. We don't need to use an Include for them as this is the default behavior of EF Code First. In the above example LocationAddress is a complex type. It is defined as complex type based on convention. We don't need to use complex type attribute if we are following the conventions.
All of these methods can be called in the Main method as follows:
You can see the definition of InstituteDatabaseInitializer in the attached code as this would not add any value to this discussion. When we run this we get the following output:
Download Code
- Using virtual with associated entities in an entity's definition, e.g. for Student entity:
    public virtual ICollection<Course> StudentCourses { get; set; }
    public virtual Vehicle VehicleDetails { get; set; }
    public virtual Department StudentDepartment { get; set; }
- Using Include in the iterator for loading courses as follows:
    foreach (var course in instituteEntites.Courses
        .Include("CourseStudents")
        .Include("CourseStudents.VehicleDetails")
        .Include("CourseStudents.StudentDepartment")
        .Include("CourseOfferedAt"))
Why different data loading mechanisms for related entities?
First of all, we need to understand why we need these different mechanisms for loading data. The reason is that we want optimized database access: fewer round trips and less data loaded. We must remember that the database resides in an outside system, usually with a network in between, so every access is a costly I/O operation. Entity Framework must translate the relevant operations on entities into SQL statements. These loading options guide the framework toward generating more efficient SQL.
The loading options tell Entity Framework what to do with related entities whenever an entity is loaded. Suppose A is an entity which has an EntityReference or an EntityCollection of another entity B. The options direct EF whether B's data should be loaded when A is loaded. If we need to load just A's data, and B's data only when it is accessed for the first time, then we need lazy loading. If we know that we would always need the related B's data when A is loaded, then we need eager loading. On the other hand, if we don't want the related B's loaded with A and instead request them separately when required, then it comes under explicit loading. Since EF Code First is mostly convention based, our POCO entities identify these options using some conventions.
Types of Loading
Based on the above discussion it is apparent that there are three ways that related entities can be loaded when one of the source entities is loaded. They are as follows:
- Explicit
- Lazy loading
- Eager loading
No Implicit Eager Loading
It must be remembered that turning off lazy loading does not mean implicit eager loading. There is nothing like implicit eager loading of entities in Entity Framework, except for complex types. This is because all entities might be related to each other in some way or the other [foreign key relationships in databases]. If EF allowed implicit eager loading, then loading one entity could mean loading the whole database, which might be very costly. And since it would be implicit, developers would have no idea what is going on behind the scenes unless SQL Profiler is hooked up.
Lazy loading & Better Performance
The contrary is also not true i.e. lazy loading might or might not result in better performance. If we don't need data of related entities then the lazy loading would save us from querying additional data from the database. But if we need the navigational data then it means that there would be additional queries to the database when the navigational property is accessed. This might be unnecessary.
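To make the round-trip trade-off concrete, here is a toy sketch in plain C#. It does not use Entity Framework at all: FakeDatabase, LazyStyle and EagerStyle are hypothetical names invented for this illustration, and the counter only stands in for the number of SQL queries that would be sent to the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy in-memory "database" that counts round trips. Hypothetical, not an EF type.
public class FakeDatabase
{
    private readonly Dictionary<string, List<string>> studentsByDepartment =
        new Dictionary<string, List<string>>
        {
            { "Computer Systems Engineering", new List<string> { "Muhammad", "Chattan", "Imran" } },
            { "Electrical Engineering", new List<string> { "Jawad", "Ali" } }
        };

    public int RoundTrips { get; private set; }

    // One query that returns only the departments.
    public List<string> QueryDepartments()
    {
        RoundTrips++;
        return studentsByDepartment.Keys.ToList();
    }

    // One query per department, as lazy loading would issue on first access.
    public List<string> QueryStudents(string department)
    {
        RoundTrips++;
        return studentsByDepartment[department].ToList();
    }

    // One combined query, as an eager load would issue.
    public Dictionary<string, List<string>> QueryDepartmentsWithStudents()
    {
        RoundTrips++;
        return studentsByDepartment.ToDictionary(p => p.Key, p => p.Value.ToList());
    }
}

public static class RoundTripDemo
{
    // Lazy style: 1 query for the departments plus 1 per department touched.
    public static int LazyStyle(FakeDatabase db)
    {
        foreach (var department in db.QueryDepartments())
        {
            db.QueryStudents(department);
        }
        return db.RoundTrips;
    }

    // Eager style: everything comes back in a single round trip.
    public static int EagerStyle(FakeDatabase db)
    {
        db.QueryDepartmentsWithStudents();
        return db.RoundTrips;
    }

    public static void Main()
    {
        Console.WriteLine("Lazy style round trips:  {0}", LazyStyle(new FakeDatabase()));
        Console.WriteLine("Eager style round trips: {0}", EagerStyle(new FakeDatabase()));
    }
}
```

With two departments the lazy style makes three round trips and the eager style makes one. If the students were never touched, the lazy style would stop at a single, smaller query, which is why neither option is always the faster one.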
Proxy Is Entity Decorator
These types of loading are enabled in Entity Framework by introducing proxies. A proxy is an implementation of the Decorator design pattern that provides additional capabilities to user-defined entities. Proxy and entity do not share some separate abstraction; instead the proxy extends the entity. It inherits from the same POCO entity and creates a field for the POCO type it is decorating. Then it decorates it by adding some features. One such feature is the loading behavior of related entities. Entity Framework is designed with the idea of convention over configuration, so defining our POCO entities in a certain way basically communicates to Entity Framework how we expect the properties to behave in the proxy.
A public non-sealed navigation property defined as virtual in a POCO entity tells Entity Framework to enable lazy loading for the specified related entity. The related entity is not loaded when the main entity is loaded.
In the case of Student, we have defined the VehicleDetails and StudentDepartment navigation properties as virtual, so these entities are not loaded when the related Student entity is loaded. If we just load a Student entity, the related Vehicle and Department would not be part of the query. EF loads them lazily when required.
As a consumer of an Object Relational Mapping [ORM] tool, we don't want our entities to be adulterated by inheriting from types provided by Entity Framework. In earlier versions of Entity Framework that type used to be EntityObject. We also don't want to decorate our entities with attributes which bind us to a particular ORM tool. Entity Framework still supports attributes, by reusing some provided by the Data Annotations library and some custom attributes developed by the Entity Framework team at Microsoft. In order to understand the various loading options for related entities, we need to understand the concept of proxies in Entity Framework, and how a proxy adds certain features to our entities without modifying them. The features of proxies include lazy loading and change tracking of an entity's data. This makes it very clear that lazy loading is just a decoration of our entities provided by adding a proxy, so if we disable creation of the proxy, no lazy loading is available. This is exactly what happens.
this.Configuration.LazyLoadingEnabled = false;
Additionally, the POCO classes must be designed such that they support proxy creation and lazy loading. The requirements include that the POCO entity must be defined as public and that navigation properties must support late binding by being virtual. The details can be found here:
http://msdn.microsoft.com/en-us/library/gg715126%28v=vs.103%29.aspx
The proxies created by Entity Framework for our entities are certainly decorations of these entities. They add lazy loading and change tracking to the entity without modifying it, which follows the Open/Closed principle. The proxy can be used wherever the original entity is expected, which follows the Liskov Substitution principle. Remember that Entity Framework is smart enough to recognize the need for proxy creation. If the entity is defined in such a way that creating a proxy would not add any value, i.e. neither change tracking nor lazy loading, then EF would not create a proxy and would work with the entity type itself. Proxy creation can also be disabled on the DbContext, which would also result in the entity types being used directly.
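The idea of a proxy that extends the entity can be sketched by hand in plain C#. This is only an illustration of the pattern and not the class Entity Framework actually generates at run time: StudentProxySketch, its loader delegate and LoadCount are hypothetical names, and Student and Department are cut down to the members needed here.

```csharp
using System;

public class Department
{
    public string DepartmentName { get; set; }
}

public class Student
{
    public string FirstName { get; set; }

    // virtual, so that a derived proxy can intercept access to the property
    public virtual Department StudentDepartment { get; set; }
}

// Hand-written stand-in for a run-time generated proxy (hypothetical).
public class StudentProxySketch : Student
{
    private readonly Func<Department> loader;
    private bool isLoaded;

    // Counts how often the "database" was hit, for demonstration only.
    public int LoadCount { get; private set; }

    public StudentProxySketch(Func<Department> loader)
    {
        this.loader = loader;
    }

    public override Department StudentDepartment
    {
        get
        {
            // Load the related entity on first access only.
            if (!isLoaded)
            {
                base.StudentDepartment = loader();
                isLoaded = true;
                LoadCount++;
            }
            return base.StudentDepartment;
        }
        set
        {
            base.StudentDepartment = value;
            isLoaded = true;
        }
    }
}

public static class ProxyDemo
{
    public static void Main()
    {
        // The proxy is used wherever a Student is expected.
        Student student = new StudentProxySketch(
            () => new Department { DepartmentName = "Computer Systems Engineering" })
        {
            FirstName = "Muhammad"
        };

        Console.WriteLine(student.StudentDepartment.DepartmentName);
        Console.WriteLine(student.StudentDepartment.DepartmentName);
        Console.WriteLine("Loads: {0}", ((StudentProxySketch)student).LoadCount);
    }
}
```

The second access does not call the loader again, so LoadCount stays at 1. The sketch also shows why the navigation property has to be virtual and the class non-sealed: otherwise there would be nothing for the derived type to override.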
Choice of Loading option and its Consequences
Generally the setting for lazy loading is specified in the constructor of the sub-type of DbContext. In the following example, we are turning on lazy loading for all entities. This would be the default behavior of entities, which they can override.
class InstituteEntities : DbContext
{
    //...
    public InstituteEntities()
    {
        this.Configuration.LazyLoadingEnabled = true;
    }
    //...
}
Now individual entities can override this behavior for their navigational properties as follows:
- Defining a navigational property as virtual would cause lazy loading of the navigational entity.
- Defining a navigational property as non-virtual would not cause lazy loading. Now this may be loaded explicitly by using Load method on EntityCollection / EntityReference, or using Include method and specifying query path.
With lazy loading disabled, related entities are no longer loaded on demand when they are first accessed through a navigation property. The only options remaining are explicit and eager loading. Both of them are specified through code.
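As a rough sketch of what "specified through code" can look like with lazy loading switched off, the following uses the DbContext Entry API to load a reference and a collection explicitly. The entity classes are simplified stand-ins for the ones in this post, ExplicitLoadingSketch is a hypothetical name, and the code assumes the EntityFramework assembly is referenced and that a database is reachable through the default connection conventions.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

// Simplified stand-ins for the entities used in this post.
public class Department
{
    public int DepartmentId { get; set; }
    public string DepartmentName { get; set; }
    public ICollection<Student> Students { get; set; }
}

public class Student
{
    public int StudentId { get; set; }
    public string FirstName { get; set; }
    public Department StudentDepartment { get; set; }
}

public class InstituteEntities : DbContext
{
    public InstituteEntities()
    {
        // No deferred loading: related data arrives only when we ask for it.
        this.Configuration.LazyLoadingEnabled = false;
    }

    public DbSet<Student> Students { get; set; }
    public DbSet<Department> Departments { get; set; }
}

public static class ExplicitLoadingSketch
{
    public static void Run()
    {
        using (var context = new InstituteEntities())
        {
            var student = context.Students.FirstOrDefault();
            if (student == null)
            {
                return; // nothing seeded, nothing to demonstrate
            }

            // Explicitly load the reference, unless it is already loaded.
            var departmentEntry = context.Entry(student)
                .Reference(s => s.StudentDepartment);
            if (!departmentEntry.IsLoaded)
            {
                departmentEntry.Load();
            }

            var department = student.StudentDepartment;
            if (department == null)
            {
                return; // the student does not belong to a department
            }

            // Explicitly load the collection on the other side.
            context.Entry(department)
                .Collection(d => d.Students)
                .Load();

            Console.WriteLine("{0} has {1} student(s)",
                department.DepartmentName, department.Students.Count);
        }
    }
}
```

Calling ExplicitLoadingSketch.Run() issues one query for the student, one for the department and one for the department's students. An Include on the original query would be the eager alternative that fetches the related data up front.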
How to make a decision?
The choice of pattern for loading related entities should be a deliberate one, because it impacts the overall performance of the system. Consider the difference in terms of the two concepts of locality of reference. As we know, there are two ideas for locality of reference:
- Temporal Locality
- Spatial Locality
Entity Definition
Let's see what changes in the Student entity would be needed to support lazy loading. First we have changed the access modifier to public. Additionally we have three related entities of Student. They are StudentCourses, VehicleDetails and StudentDepartment of type Course, Vehicle and Department respectively. Here VehicleDetails and StudentDepartment are defined as virtual to support lazy loading.
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;

    public class Student
    {
        public int StudentId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int GradePointAverage { get; set; }
        public bool IsOutStanding { get; set; }

        //no lazy loading
        public ICollection<Course> StudentCourses { get; set; }

        //support lazy loading
        public virtual Vehicle VehicleDetails { get; set; }
        public virtual Department StudentDepartment { get; set; }
    }
}
Similarly, Course entity is also updated as follows:
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;

    public class Course
    {
        public int CourseId { get; set; }
        public string CourseName { get; set; }

        //no lazy loading
        public ICollection<Student> CourseStudents { get; set; }

        //support lazy loading
        public virtual ICollection<CourseLocation> CourseOfferedAt { get; set; }
    }
}
Let's see how LocationAddress is defined. Since this is a ComplexType, it does not support lazy loading.
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;
    using System.ComponentModel.DataAnnotations;

    public class CourseLocation
    {
        public int CourseLocationId { get; set; }
        public string LocationName { get; set; }
        public LocationAddress Address { get; set; }
        public virtual ICollection<Course> CoursesOffered { get; set; }
    }

    public class LocationAddress
    {
        public string StreetAddress { get; set; }
        public string Apartment { get; set; }
        public string City { get; set; }
        public string StateProvince { get; set; }
        public string ZipCode { get; set; }
    }
}
Explicit Loading
Explicit loading fetches the required entity's data from the database only when we ask for it. We can use the Load method on an EntityReference or EntityCollection. ObjectContext's LoadProperty method can also be used for the same purpose.
private static void PrintStudentNames()
{
    using (var context = new InstituteEntities())
    {
        //explicit loading of all students
        context.Students.Load();

        foreach (var student in context.Students)
        {
            Console.WriteLine("Student Name: {0}", student.FirstName);
        }
    }
}
Let's see another example of explicit loading. Here we are doing conditional explicit loading of students belonging only to a particular department.
private static void PrintDepartmentAndStudentsDetails()
{
    using (var context = new InstituteEntities())
    {
        //Explicit loading with query
        foreach (var department in context.Departments)
        {
            Console.WriteLine(string.Format("Department Name: {0}", department.DepartmentName));

            context.Entry<Department>(department)
                .Collection(d => d.Students)
                .Query()
                .Where(s => s.StudentDepartment.DepartmentId == department.DepartmentId)
                .Load();

            foreach (var student in department.Students)
            {
                Console.WriteLine(string.Format(" Student: {0}", student.FirstName));
            }
        }
    }
}
In the following example, we are using Linq to Entities to query Department and its related Students entities. Here stds is of type DbQuery.
private static void PrintStudentAndTheirDepartments()
{
    using (var context = new InstituteEntities())
    {
        //Loading using Linq to Entities
        var stds = from d in context.Departments
                   from st in d.Students
                   select new { d.DepartmentName, st.FirstName };

        foreach (var s in stds)
        {
            Console.WriteLine(
                string.Format("Department: {0}, Student: {1}", s.DepartmentName, s.FirstName));
        }
    }
}
DbQuery is also the base class of the DbSet type used with DbContext.
We can use an iterator to go through the items. It is defined as follows:
Lazy Loading Students
In the following example, we are getting any arbitrary department returned. We are then querying for all the students in the department and printing their details.
private static void LazyPrintStudentNames()
{
    using (var context = new InstituteEntities())
    {
        var firstDepartment = context.Departments.First<Department>();

        //lazy loading of all students
        foreach (var student in firstDepartment.Students)
        {
            Console.WriteLine("Student Name: {0}", student.FirstName);
        }
    }
}
Eager Loading
In Entity Framework Code First, eager loading is supported by using Include. It not only supports loading directly related entities, but can also traverse the hierarchy of relationships. See how we are loading the vehicle details of all the students for all departments. We can also ensure type safety by using expressions instead of strings. Here we are loading the Students data that way.
private static void PrintInstituteDetails()
{
    using (var context = new InstituteEntities())
    {
        //Eager loading of Vehicle and Students information
        foreach (var department in context.Departments
            .Include(d => d.Students)
            .Include("Students.VehicleDetails"))
        {
            Console.WriteLine(
                string.Format("Department Name: {0}", department.DepartmentName));

            foreach (var student in department.Students)
            {
                Console.WriteLine(
                    string.Format("***Student: {0}", student.FirstName));

                if (student.VehicleDetails != null)
                {
                    //Explicit loading using Load
                    context.Entry<Student>(student)
                        .Collection<Course>(s => s.StudentCourses)
                        .Load();

                    foreach (var c in student.StudentCourses)
                    {
                        Console.WriteLine("*********Course Name: {0}", c.CourseName);

                        //loading navigation property to a complex type
                        Console.WriteLine(
                            string.Format("Also offered at city: {0}",
                                c.CourseOfferedAt.First<CourseLocation>().Address.City));
                    }
                }
            }
        }
    }
}
It must be remembered that complex types are always loaded in eager fashion. We don't need to use an Include for them as this is the default behavior of EF Code First. In the above example LocationAddress is a complex type. It is defined as complex type based on convention. We don't need to use complex type attribute if we are following the conventions.
All of these methods can be called in the Main method as follows:
public static void Main(string[] args)
{
    Database.SetInitializer<InstituteEntities>(new InstituteDatabaseInitializer());

    //Lazy loading
    LazyPrintStudentNames();

    //Linq to Entities
    PrintStudentAndTheirDepartments();

    //Explicit Loading
    PrintStudentNames();

    //Explicit loading with query
    PrintDepartmentAndStudentsDetails();

    //Explicit, Eager and Lazy Loading
    PrintInstituteDetails();

    Console.ReadLine();
}
You can see the definition of InstituteDatabaseInitializer in the attached code as this would not add any value to this discussion. When we run this we get the following output:
Download Code
Sunday, February 12, 2012
Entity Framework Code First - Relationship between Entities II
This is part of the series of discussions about entity framework. In the last post, we touched the basics about relationship between entities. We discussed how the relational database concepts of different relationship types can be mapped to object oriented entities. We are progressively working on an Institute scenario discussing the various entity framework features.
In this post we will be adding on to the previous post by discussing the relationship between entities further. We will discuss how we can define one-to-many relationships. In the previous post we defined a relationship between Departments and Students entities. This relationship was defined such that a student might not belong to a department. If we look at the database table created, we see that there is a foreign key added in the Students table. It is allowing null values so a student can exist without belonging to any department.
In order to change the relationship to one-to-many, we need to specify it when the model is being built. We can do it by overriding OnModelCreating method of InstituteEntities (DbContext). Here we are specifying that it is mandatory for a student to belong to a department. Entity framework would not allow a student to be saved to the database without belonging to a department.
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);
    modelBuilder.Entity<Student>().HasRequired(s => s.StudentDepartment);
}
This would result in generating the Students table as follows:
So a similar Id column has been generated to hold the foreign key for department, but it is now specified as not null, which changes the relationship to one-to-many from the previous zero-to-many relationship. Basically EF Code First uses conventions for keys, both primary and foreign. Just adding a non-nullable primitive property for holding the Department table's foreign key would update the table definition to have a non-nullable foreign key. If we name it DepartmentId then, based on convention, the run-time would realize that it holds the foreign key of Department and would make it non-nullable. Let us update the Student entity as follows:
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;

    class Student
    {
        public int StudentId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int GradePointAverage { get; set; }
        public bool IsOutStanding { get; set; }

        //Foreign key of Department
        public int DepartmentId { get; set; }

        public virtual ICollection<Course> StudentCourses { get; set; }
        public virtual Vehicle VehicleDetails { get; set; }
        public virtual Department StudentDepartment { get; set; }
    }
}
Just updating the above code would result in the updated Student's table in database as follows:
Please note that the name of the foreign key is updated to be DepartmentId. It has also been updated as non-null. This would stop us from adding a student without specifying department information. If we needed the foreign key to be picked up based on convention but still needed to keep the relationship as zero-to-many, then we could just use a nullable type and the framework would take care of that. Let's update DepartmentId as nullable int [int?] in the Student entity as follows:
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;

    class Student
    {
        public int StudentId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int GradePointAverage { get; set; }
        public bool IsOutStanding { get; set; }

        //Foreign key of Department
        public int? DepartmentId { get; set; }

        public virtual ICollection<Course> StudentCourses { get; set; }
        public virtual Vehicle VehicleDetails { get; set; }
        public virtual Department StudentDepartment { get; set; }
    }
}
This would result in the following Student's table in the database.
If we had used a name for Department's Id which is not based on convention then the framework obviously would not recognize the significance and would treat that field as just entity's property and would just create a column for the field in the database table. Let us update DepartmentId as DepartmentInfoId.
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;

    class Student
    {
        public int StudentId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int GradePointAverage { get; set; }
        public bool IsOutStanding { get; set; }

        //Foreign key of Department
        public int DepartmentInfoId { get; set; }

        public virtual ICollection<Course> StudentCourses { get; set; }
        public virtual Vehicle VehicleDetails { get; set; }
        public virtual Department StudentDepartment { get; set; }
    }
}
Now run the application. You can notice that the Student's table is created as follows:
Since the new key is not based on EF convention, the framework has not recognized it. In order to help the framework, we can specify this in OnModelCreating method of the DbContext as follows:
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);
    modelBuilder.Entity<Student>().HasRequired(s => s.StudentDepartment)
        .WithMany(d => d.Students)
        .HasForeignKey(s => s.DepartmentInfoId);
}
Now the framework would realize that the DepartmentInfoId should be used to define foreign key relationship between Student and Department.
If we had needed a zero-to-many relationship, we would need to update DepartmentInfoId to allow null values.
//Foreign key of Department
public int? DepartmentInfoId { get; set; }
We also need to update OnModelCreating method in InstituteEntities as follows:
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);
    modelBuilder.Entity<Student>().HasOptional(s => s.StudentDepartment)
        .WithMany(d => d.Students)
        .HasForeignKey(s => s.DepartmentInfoId);
}
The framework would still use DepartmentInfoId as the foreign key for the Department table in the Student table. It would just allow null for it, giving a zero-to-many relationship.
Please remember that this would only be possible if we define DepartmentInfoId as nullable [int?]. If we just change it as follows then model validation would fail. Let us do that and change Student POCO as follows:
//Foreign key of Department
public int DepartmentInfoId { get; set; }
Now we run the application and see it crashing.
Download Code
Saturday, February 11, 2012
Entity Framework Code First - Relationship between Entities
This is the second post of our discussion about Entity Framework Code First. In this post we will be discussing how we can create relationships between two entities. Keeping to the code first ideology, code first entities would still be based on simple POCO classes. The relationships between entities are used by the Entity Framework libraries to build the model and hence generate the database if required.
As an example, we will build on top of our previous example, in which we discussed how to create a simple application with Entity Framework Code First with one simple entity. Let's now expand this example by introducing courses in our institute. Students are assigned to these courses. Any number of students can be assigned to a course. These courses are offered at different campuses of the institute. The locations have a physical presence and a specific name. Since the institute's parking space is limited, each student is allowed to register one vehicle. Each student belongs to a certain department.
Based on the above requirements, we can create the following entities in our system.
- Student
- Vehicle
- Course
- CourseLocation
- Department
The relationships between these entities are as follows:
- One-to-One: Since a Student is allowed to register only one vehicle, there is one-to-one relationship between the two.
- One-to-Many: A Department can enroll any number of students but a student only belongs to a single department. In this post, we will keep it simple by defining a zero-to-many relationship but we will see how we can fix it in the next post.
- Many-to-Many: Any number of students can be enrolled in a course. Similarly, a student can be enrolled in multiple courses. Likewise, a course can be offered at multiple locations and a location can host multiple courses.
Student
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;

    class Student
    {
        public int StudentId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int GradePointAverage { get; set; }
        public bool IsOutStanding { get; set; }

        public virtual ICollection<Course> StudentCourses { get; set; }
        public virtual Vehicle VehicleDetails { get; set; }
        public virtual Department StudentDepartment { get; set; }
    }
}
Vehicle
namespace EFCodeFirstDatabaseCreation.Entities
{
    class Vehicle
    {
        public int VehicleId { get; set; }
        public string Year { get; set; }
        public string Make { get; set; }
        public string Model { get; set; }
    }
}
Course
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;

    class Course
    {
        public int CourseId { get; set; }
        public string CourseName { get; set; }

        public ICollection<Student> CourseStudents { get; set; }
        public virtual ICollection<CourseLocation> CourseOfferedAt { get; set; }
    }
}
Department
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;

    class Department
    {
        public int DepartmentId { get; set; }
        public string DepartmentName { get; set; }
        public virtual ICollection<Student> Students { get; set; }
    }
}
CourseLocation
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Collections.Generic;
    using System.ComponentModel.DataAnnotations;

    class CourseLocation
    {
        public int CourseLocationId { get; set; }
        public string LocationName { get; set; }
        public virtual LocationAddress Address { get; set; }
        public virtual ICollection<Course> CoursesOffered { get; set; }
    }

    [ComplexType]
    class LocationAddress
    {
        public string StreetAddress { get; set; }
        public string Apartment { get; set; }
        public string City { get; set; }
        public string StateProvince { get; set; }
        public string ZipCode { get; set; }
    }
}
Defining InstituteEntities [DbContext]
Now we need to define the DbContext to query these entities from the database. As you can see, we don't need to add the details of all the entities in the DbContext. We only need to add those entities which are to be accessed directly using the DbContext instance. The other entities which have a relationship with these entities would still have a corresponding table in the database. It is just that they would be accessed through their relationships, starting from the entities exposed on the DbContext type.
namespace EFCodeFirstDatabaseCreation.Entities
{
    using System.Data.Entity;

    class InstituteEntities : DbContext
    {
        public DbSet<Student> Students { get; set; }
        public DbSet<Course> Courses { get; set; }
        public DbSet<Department> Departments { get; set; }
    }
}
Initializing Entities with Data
Let us initialize these entities with some data. Remember that this would result in actual data inserted in the tables mapped to these entities by Entity Framework.
namespace EFCodeFirstDatabaseCreation
{
    using EFCodeFirstDatabaseCreation.Entities;
    using System.Data.Entity;
    using System.Collections.Generic;

    class InstituteDatabaseInitializer : DropCreateDatabaseAlways<InstituteEntities>
    {
        protected override void Seed(InstituteEntities context)
        {
            Student student1 = new Student()
            {
                FirstName = "Muhammad", LastName = "Siddiqi",
                IsOutStanding = true, GradePointAverage = 3
            };
            Vehicle vehicle = new Vehicle()
            {
                VehicleId = 1, Make = "Bugatti", Model = "", Year = "2012"
            };
            Department department1 = new Department()
            {
                DepartmentId = 1,
                DepartmentName = "Computer Systems Engineering",
                Students = new List<Student>()
            };
            Department department2 = new Department()
            {
                DepartmentId = 2,
                DepartmentName = "Electrical Engineering",
                Students = new List<Student>()
            };
            Student student2 = new Student()
            {
                FirstName = "Chattan", LastName = "Shah",
                IsOutStanding = true, GradePointAverage = 4,
                VehicleDetails = vehicle
            };
            Student student3 = new Student()
            {
                FirstName = "Imran", LastName = "Ashraf",
                IsOutStanding = true, GradePointAverage = 4
            };
            Student student4 = new Student()
            {
                FirstName = "Jawad", LastName = "Qureshi",
                IsOutStanding = true, GradePointAverage = 4
            };
            CourseLocation courseLocation1 = new CourseLocation()
            {
                LocationName = "Karachi Campus",
                Address = new LocationAddress()
                {
                    StreetAddress = "XYZ I.I. Chundrigar Road", City = "Karachi",
                    StateProvince = "Sindh", ZipCode = "YYYYY"
                }
            };
            CourseLocation courseLocation2 = new CourseLocation()
            {
                LocationName = "Manhattan Campus",
                Address = new LocationAddress()
                {
                    StreetAddress = "6th Street", City = "New York City",
                    StateProvince = "New York", ZipCode = "ZZZZZ"
                }
            };
            Course course1 = new Course()
            {
                CourseName = "Engineering Mechanics",
                CourseStudents = new List<Student> { student1, student2 },
                CourseOfferedAt = new List<CourseLocation> { courseLocation1, courseLocation2 }
            };
            Course course2 = new Course()
            {
                CourseName = "Fault Tolerence & Reliable System Design",
                CourseStudents = new List<Student> { student3, student4 },
                CourseOfferedAt = new List<CourseLocation> { courseLocation1 }
            };
            Student student5 = new Student()
            {
                FirstName = "Ali", LastName = "Khan",
                GradePointAverage = 3, IsOutStanding = true
            };

            course1.CourseStudents.Add(student5);

            department1.Students.Add(student1);
            department1.Students.Add(student2);
            department1.Students.Add(student3);
            department2.Students.Add(student4);
            department2.Students.Add(student5);

            context.Courses.Add(course1);
            context.Courses.Add(course2);
            context.Departments.Add(department1);
            context.Departments.Add(department2);

            base.Seed(context);
        }
    }
}
Test Code:
Now let us write some test code to see if the entities are generated as expected. The code below would also test if the data has been inserted in those entities. Here we are setting the database initializer for InstituteEntities. Then we are just writing the data to the console, testing if the entities are initialized properly.
namespace EFCodeFirstDatabaseCreation
{
    using System.Linq;
    using EFCodeFirstDatabaseCreation.Entities;
    using System.Data.Entity;
    using System;

    class Program
    {
        static void Main(string[] args)
        {
            //Initialize Database entities
            Database.SetInitializer<InstituteEntities>(new InstituteDatabaseInitializer());
            InstituteEntities instituteEntites = new InstituteEntities();

            foreach (var course in instituteEntites.Courses
                .Include("CourseStudents")
                .Include("CourseStudents.VehicleDetails")
                .Include("CourseStudents.StudentDepartment")
                .Include("CourseOfferedAt"))
            {
                string location = course.CourseOfferedAt == null
                    ? string.Empty
                    : string.Join(", ",
                        course.CourseOfferedAt.Select(l => l.LocationName).ToArray<string>());

                Console.WriteLine("***************************");
                Console.WriteLine("Course : {0}, Offered At: {1}", course.CourseName, location);

                foreach (var student in course.CourseStudents)
                {
                    Console.WriteLine(
                        string.Format("Student : {0} {1}, GPA : {2}",
                            student.FirstName, student.LastName, student.GradePointAverage));
                    Console.WriteLine(" Department : {0}", student.StudentDepartment.DepartmentName);

                    if (student.VehicleDetails != null)
                    {
                        Console.WriteLine(" Vehicle: {0} {1} {2}",
                            student.VehicleDetails.Year,
                            student.VehicleDetails.Make,
                            student.VehicleDetails.Model);
                    }
                }
                Console.WriteLine("***************************");
            }
            Console.ReadLine();
        }
    }
}
Database Relationship:
You might be wondering how these relationships are created in the database.
One-to-One & One-to-Many:
These types of relationship result in adding the foreign key on one of these entities. In our case, the relationship between Vehicle and Student is one-to-one. A student is supposed to register only one vehicle in the system. On the other hand, the relationship between Department and Student is one-to-many, as a department could have more than one student.
Many-to-Many
This is similar to generating a database from an Entity Relationship Diagram. It results in adding a relationship table in the database. In our case the relationship between Student and Course is many-to-many. This would result in generating a new table in the database containing the primary keys of both entities. For our case, the name of this table is CourseStudents. Similarly, a course can be offered at multiple locations, i.e. the relationship between Course and CourseLocation is also many-to-many. This results in creating the CourseLocationCourses table in the database.
Complex Type
A complex type is represented in the database by adding all properties of the complex type to the table of the entity itself. By default, the names of these columns are preceded by the name of the property that holds the ComplexType instance in the entity.
Output:
When we run the project, it results in the following output.
Download Code: