Wednesday, July 31, 2013

Handling NHibernate Entity Without Session

NHibernate provides the session as its persistence manager interface. The session is used for adding, retrieving and updating information (that is, entities) in the database; objects are persisted to the database with its help. A session has many responsibilities, including maintaining the database connection, the transaction and the context information for its entities. In most cases, detached or transient entities can simply be attached to a session, but sometimes we cannot determine an entity's state, especially if the session it was originally loaded with has been closed. For these cases, NHibernate gives us the session's Merge method.

When merging an entity into a session, here is what happens:
  • If the current session has an entity with the same ID, the changes made to this detached entity are copied to the persistent entity in the session, and it returns the persistent entity.
  • If the current session does not have any entity with the same ID, it loads the entity from the database. The changes are copied to the persistent entity, and it returns the persistent entity.
  • If the entity does not exist in the database, the current session creates a new persistent entity and copies the data from the detached entity.

Merge returns a persistent entity that contains all of the "merged" changes and is associated with the current session; note that this is not necessarily the same instance that was passed in. When the current session's transaction is committed, all of the changes are saved to the database. The sample code below demonstrates how Merge can be used. (The assert statement shows that the merged entity is a different instance from the one that was passed in.)

// "user" is a detached entity, for example the one returned by CreateUser() below.
user.FirstName = "Brad";
user.LastName = "Henry";
user.Email = "bradhenry@nexportsolutions.com";

using (var session = SessionFactory.OpenSession())
{
     session.BeginTransaction();
     var mergedUser = session.Merge(user);
     session.Transaction.Commit();

     // Merge returns the persistent instance, not the detached one we passed in.
     Assert.False(ReferenceEquals(user, mergedUser));
}

public User CreateUser()
{
    var user = new User
    {
        FirstName = "Robert",
        LastName = "Smith",
        Email = "robertsmith@nexportsolutions.com"
    };

    using (var session = SessionFactory.OpenSession())
    {
        session.BeginTransaction();
        session.Save(user);
        session.Transaction.Commit();

        // Evict detaches the entity from the session before it is returned.
        session.Evict(user);
    }

    return user;
}

In conclusion, the session's Merge method lets an entity be persisted even when it is not associated with the current session. This gives us the flexibility to process an entity without holding a session open: a developer can load an entity with one session, free that session immediately and later save the processed entity with another. The sessions themselves stay short-lived.

About NexPort Solutions Group
NexPort Solutions Group is a division of Darwin Global, LLC, a systems and software engineering company that provides innovative, cost-effective training solutions and support for federal, state and local government, as well as the private sector.

Thursday, July 25, 2013

Full-Text Search - Part 2

In my previous post Full-Text Search - Part 1, I discussed our reporting solution using NHibernate Search and MSMQ. NHibernate Search had been dead in the water for months when our administrators began complaining that user search results were extremely slow. We were using NHibernate to do a joined query on two tables with a "like" condition on four of the columns. Needless to say, it was not the most efficient operation.

Our development team decided to revisit Full-Text Search and scrap NHibernate Search. Instead, we wanted to develop a service that we could communicate with over HTTP that would update Lucene documents. We were researching how this could be done when we found Apache Solr. It did exactly what we wanted to do with our proposed service but also had the benefit of a strong user base. We scrapped the idea of creating our own service and decided to go with the proven solution instead.

Because of our unfamiliarity with the product, we postponed automating the Solr installation until a later release and installed it manually. Solr is written in Java and needs a Java servlet container to run in. Apache Tomcat was the recommended container, so we installed it first by following the installation prompts. Installing Solr itself simply required copying the solr.war file from the download package into the lib folder of the Tomcat directory. All of the other entity-specific work was more easily automated.

The remaining automation tasks were generating the Extensible Markup Language (XML) files for the entities we were indexing and copying over the configuration files. The Tomcat context XML files were needed to tell Tomcat where the indexes live on disk. The schema XML file served as a contract, allowing Solr to correctly construct the Lucene documents and allowing us to communicate changes to those documents.
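For reference, a Tomcat context file for Solr generally looks something like the fragment below; the paths and core location here are illustrative, not our actual configuration:

<Context docBase="/opt/solr/solr.war" debug="0" crossContext="true">
  <!-- Points Solr at the home directory that holds its configuration and index files. -->
  <Environment name="solr/home" type="java.lang.String" value="/opt/solr/users" override="true"/>
</Context>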

To avoid processing our complex object graph, we decided not to tie the index processing to our existing entities. Instead, we created a separate entity that only had the four fields we cared about: First Name, Last Name, Login Name and Primary Email. We wrote our own SolrFieldAttribute class that included all the flags needed to create the Solr schema XML file for the entity. We also created a SolrClassAttribute to indicate the classes for which to create the schema XML file and to create the Tomcat context file.
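The attribute classes themselves are not reproduced in this post; a minimal sketch of what they might look like, with the flag names assumed for illustration, would be:

[AttributeUsage(AttributeTargets.Class)]
public class SolrClassAttribute : Attribute
{
    // Marks a class so the generator produces a Solr schema XML file
    // and a Tomcat context file for it.
}

[AttributeUsage(AttributeTargets.Property)]
public class SolrFieldAttribute : Attribute
{
    // Flags consumed by the schema generator when writing the field elements.
    public bool Stored { get; set; }
    public bool Lowercase { get; set; }
    public bool Tokenized { get; set; }

    // Added later to support multi-valued fields such as the organization list.
    public bool MultiValued { get; set; }
}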

Initially, we only stored the four fields that were going to be displayed in the result list and that we were going to use to filter the results. This allowed us to process the documents for these lightweight entities much more quickly. Unfortunately, our system also had to restrict the users an administrator could see based on the administrator's permission set. To address this problem, we used Solr's multi-valued property to index all of the organizations in which the user had a subscription. From the code's standpoint, we added another property to the SolrFieldAttribute that indicated to our schema generator that it needed to add multiValued="true" to the schema field.

To provide the best user results list, we also had to add a field to the entity called Full Name that was the First Name and Last Name separated by a space. This would allow a user to enter "Sally Sunshine," and she would be the top user in the results list.
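Putting those attributes together, the lightweight search entity looked something like the sketch below (class and property names here are illustrative rather than our exact code):

[SolrClass]
public class UserSearchDocument
{
    [SolrField(Stored = true)]
    public Guid Id { get; set; }

    [SolrField(Stored = true, Lowercase = true, Tokenized = true)]
    public string FirstName { get; set; }

    [SolrField(Stored = true, Lowercase = true, Tokenized = true)]
    public string LastName { get; set; }

    // First name and last name separated by a space, so "Sally Sunshine" matches well.
    [SolrField(Stored = true, Lowercase = true, Tokenized = true)]
    public string FullName { get; set; }

    [SolrField(Stored = true, Lowercase = true, Tokenized = true)]
    public string LoginName { get; set; }

    [SolrField(Stored = true, Lowercase = true, Tokenized = true)]
    public string PrimaryEmail { get; set; }

    // Every organization the user has a subscription in; becomes multiValued="true" in the schema.
    [SolrField(MultiValued = true)]
    public IList<Guid> OrganizationId { get; set; }
}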

To make the update calls to Solr, we used a background job scheduling service that we already had in place. When an object that needed to be indexed changed, we created an entry in a table with the object's ID and the date it was updated. The scheduling service job, called the indexing job, then pulled rows from that table, converted them to Solr JSON objects and sent HTTP calls to Solr with the JSON. The first problem we encountered was that the JSON used by Solr is not standard JSON, so we had to write our own converter instead of using a standard one like Newtonsoft. Notice in the examples below that Solr JSON does not handle arrays the same way standard JSON does; a sketch of such a converter follows the examples.

Example of standard JSON:

{
     "id":"a017a879-a2f2-4e37-811d-8f13ace56819",
     "firstname":"Sally",
     "lastname":"Sunshine",
     "fullname":"Sally Sunshine",
     "primaryemail":"ssunshine@sunshine.net",
     "loginname":"ssunshine",
     "organizationid":
     [
          "c04eb269-3a81-442f-85b5-23b4bd87d262",
          "87466c69-1b8f-4895-9483-5c1cb55ecb2b"
     ]
}

Example of Solr JSON:

{
     "id":"a017a879-a2f2-4e37-811d-8f13ace56819",
     "firstname":"Sally",
     "lastname":"Sunshine",
     "fullname":"Sally Sunshine",
     "primaryemail":"ssunshine@sunshine.net",
     "loginname":"ssunshine",
     "organizationid":"c04eb269-3a81-442f-85b5-23b4bd87d262",
     "organizationid":"87466c69-1b8f-4895-9483-5c1cb55ecb2b"
}
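A standard serializer would emit the array form, so our converter had to repeat the key once per value for multi-valued fields. The rough idea is sketched here; this is not our production code, and escaping is simplified for brevity:

public static class SolrJsonConverter
{
    // Builds a Solr-style JSON object in which a multi-valued field is written
    // as repeated keys instead of a JSON array. Values are assumed not to need
    // escaping, to keep the sketch short.
    public static string ToSolrJson(IDictionary<string, object> fields)
    {
        var sb = new StringBuilder("{");
        var first = true;

        foreach (var field in fields)
        {
            // Anything enumerable (other than a string) is treated as multi-valued
            // and emitted once per value under the same key.
            var enumerable = field.Value as IEnumerable;
            var values = (enumerable != null && !(field.Value is string))
                ? enumerable.Cast<object>()
                : new[] { field.Value };

            foreach (var value in values)
            {
                if (!first) sb.Append(",");
                first = false;
                sb.AppendFormat("\"{0}\":\"{1}\"", field.Key, value);
            }
        }

        return sb.Append("}").ToString();
    }
}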

After successfully writing our specialized JSON converter, we ran into problems with filtering the results. The beauty of Full-Text Search is that it can return close matches instead of only exact matches, but we were not getting the results we expected. The results returned by Solr were sorted by a score of how well each result matched the query criteria. To get the best possible match, we had to add weights to each of the filters: exact matches got the most weight, split-term matches got the next highest weight and wildcard matches got the least weight.
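As an illustration, a search for "ssunshine" against the login name might be expanded into a boosted query along these lines (the boost values here are invented for the example, not our actual weights):

loginname:ssunshine^10 OR loginname.tokenized:ssunshine^5 OR loginname.lowercase:ssunshine*^1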

To avoid problems with case sensitivity, we added a Lowercase property to the SolrFieldAttribute that told Solr to automatically add a lowercase version of the indexed field to the document. It did this by putting two versions of the field in the schema XML file, along with a copyField element. We also added a Tokenized property to the SolrFieldAttribute that worked in a similar way but broke the field on spaces rather than lowercasing it. The generated schema fields can be seen below.


<fields>
 <field name="indexdate" indexed="true" type="date" stored="true" multiValued="false" default="NOW"/>
 <field name="id" indexed="true" type="uuid" stored="true" required="true"/>
 <field name="firstname" indexed="true" type="string" stored="true"/>
 <field name="firstname.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="firstname.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="lastname" indexed="true" type="string" stored="true"/>
 <field name="lastname.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="lastname.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="fullname" indexed="true" type="string" stored="true"/>
 <field name="fullname.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="fullname.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="primaryemail" indexed="true" type="string" stored="true"/>
 <field name="primaryemail.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="primaryemail.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="loginname" indexed="true" type="string" stored="true"/>
 <field name="loginname.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="loginname.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="organizationid" indexed="true" type="uuid" stored="false" multiValued="true"/>
</fields>
<uniqueKey>id</uniqueKey>
<copyField dest="firstname.lowercase" source="firstname"/>
<copyField dest="firstname.tokenized" source="firstname"/>
<copyField dest="lastname.lowercase" source="lastname"/>
<copyField dest="lastname.tokenized" source="lastname"/>
<copyField dest="fullname.lowercase" source="fullname"/>
<copyField dest="fullname.tokenized" source="fullname"/>
<copyField dest="primaryemail.lowercase" source="primaryemail"/>
<copyField dest="primaryemail.tokenized" source="primaryemail"/>
<copyField dest="loginname.lowercase" source="loginname"/>
<copyField dest="loginname.tokenized" source="loginname"/>

The initial indexing of entities in the database took some time. We wrote a separate job that went through the Users table and added rows to the update table that the indexing job was monitoring. After the initial indexing, each NHibernate update for a user inserted a row into the update table. Because the indexing job processed hundreds of entities at a time, the user documents closely mirrored the changes to the database.
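Conceptually, that separate seeding job did little more than the following sketch (the entity and property names here are illustrative, not our actual code):

using (var session = SessionFactory.OpenSession())
{
    session.BeginTransaction();

    // Queue every existing user for indexing; the regular indexing job
    // picks these rows up in batches. (Paging/batching omitted for brevity.)
    foreach (var user in session.Query<User>())
    {
        session.Save(new SearchIndexUpdate
        {
            ObjectId = user.Id,
            DateUpdated = DateTime.UtcNow
        });
    }

    session.Transaction.Commit();
}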

Our administrators were very happy with the speed of our new Full-Text user management search. After performing a search, the administrator could then click the user's name in the list to be taken to the user's full profile. This increase in efficiency caused administrators to begin to wonder where else this magical solution could be applied.

In Part 3, we'll bring this series full-circle and discuss how Full-Text Search was finally able to make its way into our reporting system. Stay tuned!


Friday, July 19, 2013

Using Google Spreadsheet as a Burndown Chart


Introduction

No one in our shop questions the value of a highly visible burndown chart for tracking daily progress, but as we move to a more distributed team, with developers working remotely or from home, both updating the chart and keeping it available become a challenge.

Discussion

Our teams generally update the burndown chart daily: we take the number of remaining story points for the current release, as recorded in our issue tracker, and mark that number on the chart. The chart was usually kept in the hallway or some other highly visible spot.

With distributed team members, we had to find a way to share a burndown chart easily. We tried several methods and finally decided to go with a Google spreadsheet. The spreadsheet allows us to display a chart for our burndown, update it daily and share it with management or testing.

Sample of the Completed Burndown

Step 1: Create a Google Spreadsheet

You'll need to start by creating a new Google spreadsheet with two sheets. The first sheet should be named chart and the second sheet named data.

Detail of the 2 sheets

Step 2: Set Up the Data Columns

Select the DATA sheet, and add the columns that you will need for your chart: Sprint, Date, Actual, Projected and Ideal. Sprint is used for reference and does not display in the burndown chart.


Step 3: Set Up the Chart

Switch to the CHART sheet, and from the Insert menu, select the Chart option.

In the chart edit dialog, set the data range to "DATA!B1:E100" and select line chart as the type. Click OK (or Update).



Step 4: Updating Your Chart Data

Now you are ready to begin using your new burndown chart. We usually fill in the Ideal column with values that give us a straight path from our starting value all the way to zero. The Actual column is filled in every day with the actual number of REMAINING story points on that day. After you have completed a few iterations (or sprints), you can begin calculating a projected path based on your current velocity.
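For example, if the Ideal column is column E, you can type the release's starting point total into E2 and fill a formula like the one below down from E3 (this assumes a 30-day release; adjust the cell references and day count to your own layout):

=MAX(0, $E$2 - (ROW() - 2) * $E$2 / 30)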


Conclusion

Using some of the sharing and publishing features built into Google Apps, you should be able to share read-only versions of your burndown chart. So far, we have had great luck with this method, and it has worked well for our entire team.


Monday, July 15, 2013

NHibernate Delayed-Deletes


We strive to provide our users with a swift and responsive user interface. No one likes waiting around for their computer to do something, especially if user input is not necessary. One such case would be when a user asks for something to be deleted. There is no need to have the user wait for the delete to finish. This work can be done in the background.

In the interest of increasing performance on the front end, we use a separate delete queue rather than deleting entities in the user's session. Deletions appear almost instantaneous to the user, even if they take a while due to cascading deletes and the creation of numerous audit logs. Of course, we don't want deleted entities showing up in the user interface while the delete queue is doing its work, so we need to filter query results so that entities marked for deletion are never presented. What we are looking for is a way to perform a soft delete while retaining the ability to find the deleted entities when necessary, in the delete queue for instance.
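The code later in this post refers to an IDeleteMarkable interface; a minimal sketch of that soft-delete contract might look like this (the real interface may carry more members):

// Implemented by any entity that supports being soft deleted.
public interface IDeleteMarkable
{
    // Set to true instead of issuing a SQL delete; the delete queue
    // performs the real delete later in the background.
    bool MarkedForDeletion { get; set; }
}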

We use NHibernate for our Data Access Layer (DAL), so the brute-force approach to this problem would be to add a where clause to the end of every query. Given the number of queries in the system, that seemed a rather poor plan, and collections and foreign-key relations would not give us the option to add a where clause at all. The approach would have performed well, since the extra "where" would only be applied where it was needed, but there would be a good chance of missing a query.

The standard way to perform soft deletes in NHibernate seems to be adding a where clause to the class tag in the mappings. That, however, makes it much more difficult to access data that has been marked for deletion but not yet deleted: with the where clause on the class mapping, no query will ever return a marked entity. One could add a subclass carrying the where clause and access the deleted data through the parent, but that also seemed to add unnecessary code and complexity.

So, in our research we found another really nice feature: NHibernate filters, which we discovered while working on the "where" approach. This feature "allows us to very easily create global where clauses that we can flip on and off at the touch of a switch." Perfect! Just what we needed: turn the filter on by default and turn it off as needed with a simple call to session.DisableFilter. So we added the filter to one class and enabled it in the OpenSession call.

The filter definition:
<filter-def name="NotMarkedForDeletion" />
The mapping to add to each class:
<class name="User" table="Users">
    ...
    <property name="MarkedForDeletion" />
    <filter name="NotMarkedForDeletion"
        condition="MarkedForDeletion != 1" />
</class>
Add the filter to the OpenSession call.
public static ISession OpenSession(string sessionName = "Nexport Session")
{
    var session = SessionFactory.OpenSession();
    session.FlushMode = FlushMode.Commit;

    if (AppSettingsKeys.Cluster.EnableNHibernateProfiler)
        HibernatingRhinos.Profiler.Appender.NHibernate.NHibernateProfiler.RenameSessionInProfiler(session, sessionName);

    session.EnableFilter("NotMarkedForDeletion");

    return session;
}
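The delete queue, or any other code that needs to see marked entities, would then switch the filter off for its own session, along these lines:

// Inside the delete queue: make marked-for-deletion entities visible again.
using (var session = OpenSession("Delete Queue Session"))
{
    session.DisableFilter("NotMarkedForDeletion");
    // ... query for marked entities and delete them here ...
}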
Nothing like an elegant solution. But, oh look: when the test harness ran, about a third of the unit tests failed with rather strange and random error messages. Maybe this was not the best solution after all.

After the filters approach failed and fixing the tests turned out not to be quick, it was back to the drawing board. Adding a where clause to the class mapping had worked rather well; it just made it impossible to access the marked data afterwards, which prevented us from actually deleting it from the system. But if, instead, we created two instances of the session factory, one with the where clause on the mappings and one without, we could control the visibility of marked objects. We lose some of the flexibility the filters would have offered, namely the ability to switch at any point within a session, but we still have session-by-session control.

We just need to create two session factories:

    private static ISessionFactory SessionFactory { get; set; }
    private static ISessionFactory DeleteSessionFactory { get; set; }

    public static void Init()
    {
        ...

        // Built before the where clauses are added, so sessions from this factory
        // can still see entities that are marked for deletion.
        DeleteSessionFactory = cfg.BuildSessionFactory();

        foreach (Type deleteMarkable in System.Reflection.Assembly.GetExecutingAssembly().GetTypes()
            .Where(mytype => mytype.GetInterfaces().Contains(typeof(IDeleteMarkable))))
        {
            var classMapping = cfg.GetClassMapping(deleteMarkable);
            try
            {
                classMapping.Where = "MarkedForDeletion != 1";
            }
            catch (Exception)
            {
                // Ignore types that have no class mapping to modify.
            }
        }

        // Built after the where clauses are added; the rest of the application uses
        // this factory, so marked entities are filtered out of every query.
        SessionFactory = cfg.BuildSessionFactory();
    }
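The delete queue then simply asks the unfiltered factory for its sessions; a sketch of such a helper (the method name is assumed):

// Sessions from this factory can still see entities marked for deletion,
// so the delete queue can load and actually delete them.
public static ISession OpenDeleteSession()
{
    var session = DeleteSessionFactory.OpenSession();
    session.FlushMode = FlushMode.Commit;
    return session;
}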

One thing to touch on is cascading. When marking an entity for deletion, we want to mark all of its children for deletion, too. However, marking every child would take about as long as simply deleting them, so we mark some, but not all, of the children as well; entities that are impossible, or nearly impossible, to reach without their parent do not need to be marked.

In the end, creating two session factories worked rather well. A few tests failed at first, but they were quickly fixed by swapping in a session that can access marked entities in a few places, mostly in the audit logging triggered by a delete call on an object. It is still a mystery why the filters caused the issues they did. We now have a system that filters out objects marked for deletion so that users never see them, while still giving us the full flexibility of NHibernate when deleting those objects. Furthermore, making other entities delete-markable is now even simpler: they need only implement the interface and add the MarkedForDeletion property and column, and everything else is handled automatically.

These changes greatly increase the speed of deletions from the users' perspective. Rather than waiting for the entity and all of its associations to be deleted, a delete is now nothing more than a few SQL update calls, with everything else happening in the background.
