Thursday, November 14, 2013

NexPort Campus Moves to Fluent Validation

Starting with NexPort Campus v5.1, NexPort will support both Castle Validation and Fluent Validation. Castle Validation will be completely phased out by v6.0 and replaced by Fluent Validation.
Fluent validators are created in a similar manner to an NHibernate mapping. Here is an example of a model entity with its mapping and validator.
    using FluentValidation;
    using FluentValidation.Attributes;

    [Validator(typeof(ValidationTestEntityValidator))]
    public class ValidationTestEntity : ModelBase
    {
        public virtual String Phone { get; set; }
        public virtual string CreditCard { get; set; }
        public virtual int NumGreaterThan7 { get; set; }
        public virtual int NumBetween2And27 { get; set; }

    }


    public class ValidationTestEntityMap : FluentNHibernate.Mapping.ClassMap<ValidationTestEntity>
    {
        public const string TableName = "ValidationTestEntity";

        public ValidationTestEntityMap()
        {

            Table(TableName);
            Id(x => x.Id)
                .GeneratedBy.Assigned()
                .UnsavedValue(new Guid("DEADBEEF-DEAD-BEEF-DEAD-BEEFDEADBEEF"));

            Map(e => e.Phone);
            Map(e => e.CreditCard);

            Map(e => e.NumGreaterThan7);
            Map(e => e.NumBetween2And27);


        }
    }

    public class ValidationTestEntityValidator : AbstractValidator<ValidationTestEntity>
    {
        public ValidationTestEntityValidator()
        {
            RuleFor(e => e.CreditCard).CreditCard().NotEmpty();

            RuleFor(e => e.NumGreaterThan7).GreaterThan(7);
            RuleFor(e => e.NumBetween2And27).GreaterThan(2).LessThan(27);

        }
    }

Simple entity validation can be performed anytime by instantiating the validator directly and testing the result:
                var validator = new ValidationTestEntityValidator();
                var result = validator.Validate(entity);
Validation can also be performed anytime by using the Validation Factory:
                var factory = new FluentValidatorFactory();
                var validator = factory.CreateInstance(entity.GetType());
                validator.Validate(entity);
The ValidatorAttribute is applied to the model entity to let the ValidatorFactory know which validator to use.
    
    [Validator(typeof(ValidationTestEntityValidator))]
    public class ValidationTestEntity : ModelBase
    {
When saving or updating an entity to an NHibernate session, there is no need to validate it first. A validation check is performed in the ModelBaseEventListener for all updates and inserts. If the entity fails to validate, a ValidationException will be thrown. Until v6.0, the ModelBaseEventListener will validate against both the Fluent and Castle validation frameworks.
    using (var session = NHibernateHelper.OpenSession())
    {
        session.BeginTransaction(IsolationLevel.ReadUncommitted);
        var testObj = session.Load<ValidationTestEntity>(id);

        // THIS IS NOT A VALID CREDIT CARD NUMBER
        testObj.CreditCard = "123667";

        // A VALIDATION EXCEPTION WILL BE THROWN BY COMMIT
        session.Transaction.Commit();
    }

The ModelBaseEventListener uses the NexPort FluentValidationFactory to create an instance of the proper validator. The factory stores singleton validator references in a ConcurrentDictionary in order to mitigate the performance hit incurred by constructing new validators.
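As an illustration, a caching validator factory along those lines might look like the sketch below. The class name and exact shape here are assumptions, not NexPort's actual FluentValidationFactory:

    using System;
    using System.Collections.Concurrent;
    using System.Reflection;
    using FluentValidation;
    using FluentValidation.Attributes;

    // Hypothetical sketch: one validator instance is built per model type and reused.
    public class CachingValidatorFactory
    {
        private static readonly ConcurrentDictionary<Type, IValidator> Validators =
            new ConcurrentDictionary<Type, IValidator>();

        public IValidator CreateInstance(Type modelType)
        {
            return Validators.GetOrAdd(modelType, type =>
            {
                // Resolve the validator type from the [Validator] attribute on the entity.
                var attribute = type.GetCustomAttribute<ValidatorAttribute>();
                return attribute == null
                    ? null
                    : (IValidator)Activator.CreateInstance(attribute.ValidatorType);
            });
        }
    }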

In my next article, I will discuss using Fluent Validation to validate model entities on the client side. In the meantime, please check out the Fluent Validation Documentation.


Wednesday, October 2, 2013

Database Table Mapping - Fluent NHibernate

Maintaining database data and interactions is a tough and time-consuming task. Object Relational Mappers (ORMs) were devised to solve precisely that problem. They allow developers to map database tables to object-oriented classes and use modern refactoring tools to make changes to the code-base. This allows for more rapid development and streamlined maintenance.

Initially, we used Castle ActiveRecord as an abstraction layer for our database. It allowed us to use attributes to map the properties of our object entities to columns in the database tables. When we decided to move away from ActiveRecord due to its stalled development, we used a tool to generate plain NHibernate XML mapping files from the existing mappings. These were all well and good until they actually had to be edited: every change to a property required us to track down the related XML file and update the mapping information.
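For reference, the old attribute-based mappings looked roughly like this (a simplified illustration, not one of our actual classes):

using System;
using Castle.ActiveRecord;

[ActiveRecord("Syllabus")]
public class Syllabus : ActiveRecordBase<Syllabus>
{
    // The attributes tie each property directly to a table column.
    [PrimaryKey(PrimaryKeyType.Assigned)]
    public Guid Id { get; set; }

    [Property("Title")]
    public string Title { get; set; }
}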

We decided to try out Fluent NHibernate for our mappings. Fluent NHibernate uses C# LINQ expressions to define the relationships between entity tables and their behavior when updating the database. The beauty of Fluent NHibernate was that we could keep the XML mapping files around while we slowly moved to the more maintainable scheme. We did this by adding the configuration code shown below.

public static Configuration ApplyMappings(Configuration configuration)
{
   
     return Fluently.Configure(configuration)
         .Mappings(cfg =>
         {
             cfg.HbmMappings.AddFromAssemblyOf<Enrollment>();
             cfg.FluentMappings.AddFromAssemblyOf<Enrollment>();
         }).BuildConfiguration();
}

Then, objects could be easily mapped in code rather than using attributes or XML. We could map collections and even specify cascade behaviors for the related objects. See the example of mapping basic properties, collections and related objects below.

public class Syllabus
{
      public virtual Guid Id { get; set; }


      public virtual string Title { get; set; }


      public virtual IList<Enrollment> Enrollments { get; set; }
}

public class SyllabusMapping : FluentNHibernate.Mapping.ClassMap<Syllabus>
{
      public SyllabusMapping()
      {
           Table("Syllabus");

           Id(x => x.Id)
               .GeneratedBy.Assigned()
               .UnsavedValue(new Guid("DEADBEEF-DEAD-BEEF-DEAD-BEEFDEADBEEF"));

           Map(x => x.Title).Column("Title");

           HasMany(x => x.Enrollments)
               .KeyColumn("Syllabus")
               .LazyLoad()
               .Inverse()
               .Cascade.Delete();
     }
}

public class Enrollment
{
     public virtual Guid Id { get; set; }


     public virtual Syllabus Syllabus { get; set; }


     public virtual ActivityResultEnum Result { get; set; }
}

public class EnrollmentMapping : FluentNHibernate.Mapping.ClassMap<Enrollment>
{
     public EnrollmentMapping()
     {
          Table("Enrollments");

          Id(x => x.Id)
               .GeneratedBy.Assigned()
               .UnsavedValue(new Guid("DEADBEEF-DEAD-BEEF-DEAD-BEEFDEADBEEF"));

          Map(x => x.Result).Column("Result").CustomType<ActivityResultEnum>();

          References(x => x.Syllabus)
               .Column("Syllabus")
               .Access.Property()
               .LazyLoad();

     }
}

One problem we faced was that our object graph used inheritance heavily. We were a little apprehensive about how complex it would be to translate that to Fluent. Fortunately for us, the solution was relatively straightforward. It required creating a base mapping class and using a discriminator column to distinguish the types.

public class SyllabusMapping : FluentNHibernate.Mapping.ClassMap<Syllabus>
{
     public SyllabusMapping()
     {
         Table("Syllabus");

         Id(x => x.Id)
             .GeneratedBy.Assigned()
             .UnsavedValue(new Guid("DEADBEEF-DEAD-BEEF-DEAD-BEEFDEADBEEF"));

         Map(x => x.Title).Column("Title");

         HasMany(x => x.Enrollments)
             .KeyColumn("Syllabus")
             .LazyLoad()
             .Inverse()
             .Cascade.Delete();

         DiscriminateSubClassesOnColumn<SyllabusTypeEnum>("SyllabusType", SyllabusTypeEnum.Unknown); // Allows sub-classes
     }
}


public class Section : Syllabus
{
     public virtual string SectionNumber { get; set; }
}

public class SectionMapping : FluentNHibernate.Mapping.SubclassMap<Section>
{
     public SectionMapping()
     {
         DiscriminatorValue(SyllabusTypeEnum.Section);

         Map(x => x.SectionNumber).Column("SectionNumber");
     }
}

public class TrainingPlan : Syllabus
{
     public virtual int RequirementTotal { get; set; }
}

public class TrainingPlanMapping : FluentNHibernate.Mapping.SubclassMap<TrainingPlan>
{
     public TrainingPlanMapping()
     {
         DiscriminatorValue(SyllabusTypeEnum.TrainingPlan);

         Map(x => x.RequirementTotal).Column("RequirementTotal");
     }
}

Moving forward, this will allow us to refactor code more easily and maintain the system while adding new features. We will be able to focus on new features rather than spending all our time searching for enigmatic mappings. With Fluent NHibernate, we were able to move from XML files to a robust, refactor-friendly solution. Win.


Building a mobile site with JQueryMobile

With tablet sales expected to eclipse computer sales within the next few years, as well as the growing adoption of touchscreen computers, we decided to create a version of the UI that provides better support for touch screens via larger interactive elements, as well as better scalability over a range of resolutions.

To accomplish this, we decided to go with JQueryMobile. JQueryMobile is a natural choice since we are already making use of both JQuery and JQueryUI. Integration between the three of them seems to be rather good.

The Good

The JQueryMobile ThemeRoller will allow for the easy creation of themes, which now affect more elements than before, providing organizations with greater customization capabilities.

JQueryMobile does a good job of handling a multitude of resolutions. It (mostly) manages to reflow intelligently to provide for a good user interface even if the screen resolution is limited.

JQueryMobile manages to provide good compatibility with legacy browsers, opening the possibility of providing a unified UI across both the desktop and mobile markets.


The Bad

JQueryMobile provides limited grid support and does not play terribly nicely with other grid frameworks. We are now using rwdgrid, and some resolutions have already caused issues with forms that had to be manually tweaked.

A lot of the currently existing styles clash with new styles. Determining which styles need to go and which can stay is troublesome.

JQueryMobile should be loaded at the top of the page so that it can apply its styling as the page loads, rather than modifying it afterwards. At the same time, JQueryUI does not seem to play nicely if it is loaded after JQueryMobile, so it now needs to be loaded at the beginning, too.



In the long run, JQueryMobile seems to be a great choice for providing a mobile interface, and possibly a unified interface. It makes creating a mobile website almost as simple as generating a normal website and loading a few extra styles and scripts.


Friday, September 6, 2013

A Take On Inheritance

For a software developer, translating ideas into code is the day-to-day job. The object-oriented programming (OOP) paradigm is widely used; its strengths include abstraction, encapsulation and polymorphism, to name a few. OOP loses its flexibility, however, when the implementation does not account for its limitations. Although inheritance looks straightforward when we want to reuse code or implement polymorphism, it can end up producing code that is difficult to maintain.

Let's look at an example. Say we have to implement an assignment entity. All assignments have common properties like name, due date and time, scores and so on. In our LMS (NexPort), both course assignments (which provide content to read) and test assignments can be launched by a student without an instructor's involvement, so we add a method or property for launching assignments to the base class. Now we have to implement writing assignments and discussion assignments, which must be moderated by an instructor. These assignments inherit the members related to launching, functionality that they DO NOT support. Further compounding our problem is the instructor role: should that be supported in the base class, too? In such a scenario, the code becomes unnecessarily coupled.

The question remains: how should inheritance be implemented? First, an inheritance hierarchy always expresses an "is-a" relationship, not a "has-a" relationship; whenever we have a "has-a" relationship, it should be modeled as composition of objects, not inheritance. Second, an inheritance hierarchy should be reasonably shallow, and the developer has to make sure other developers are not likely to add more levels.
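As a minimal sketch, with hypothetical type names rather than NexPort's actual class design, the launch behavior could live in an interface that only self-launchable assignment types implement:

using System;

public abstract class Assignment
{
    // Properties common to every assignment stay in the base class.
    public virtual string Name { get; set; }
    public virtual DateTime? DueDate { get; set; }
    public virtual decimal? Score { get; set; }
}

// Only assignments a student can start on their own expose launching.
public interface ILaunchable
{
    void Launch(Guid studentId);
}

public class CourseAssignment : Assignment, ILaunchable
{
    public void Launch(Guid studentId)
    {
        // Open the course content for the student.
    }
}

// Moderated by an instructor, so it never gains a Launch member to misuse.
public class WritingAssignment : Assignment
{
}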


Friday, August 23, 2013

Full-Text Search - Part 3

In Full-Text Search - Part 2, we discussed how we used bare-bones objects for the user management search. Unfortunately, our reporting system required a much more complex solution. Our administrators were becoming increasingly impatient with NexPort Campus' slow reporting interface. This was further compounded by the limited number of reportable data fields they were given. In an attempt to alleviate these concerns, we spiked out a solution using Microsoft Reporting Services as the backbone, running on a separate server. After discovering the limitations of that system, we moved to using SQL views and replication. When replication failed again and again, we revisited Apache Solr for our reporting solution.

We began designing our Solr implementation by identifying the reportable properties we needed to support in our final object graph. The object graph included multiple levels of nesting: the most specific training record entity, the assignment status, contained the section enrollment information, which in turn contained the subscription information, which in turn contained the user information. We wanted to be able to report on each level of the training tree. Because Apache Lucene documents are inherently flat, Lucene did not understand the complex nesting of our object graph. Our first idea was to flatten it all out.

public class User
{
 [SolrField(Stored = true, Indexed = true, IsKey = true)]
 public virtual Guid Id { get; set; }

 [SolrField(Stored = true, Indexed = true, LowercaseCopy = true, TokenizedCopy = true)]
 public virtual string FirstName { get; set; }

 [SolrField(Stored = true, Indexed = true, LowercaseCopy = true, TokenizedCopy = true)]
 public virtual string LastName { get; set; }
}

public class Subscription
{
 [SolrField(Stored = true, Indexed = true, IsKey = true)]
 public virtual Guid Id { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual DateTime ExpirationDate { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid UserId { get; set; }

 [SolrField(Stored = true, Indexed = true, LowercaseCopy = true, TokenizedCopy = true)]
 public virtual string UserFirstName { get; set; }

 [SolrField(Stored = true, Indexed = true, LowercaseCopy = true, TokenizedCopy = true)]
 public virtual string UserLastName { get; set; }
}

public class SectionEnrollment
{
 [SolrField(Stored = true, Indexed = true, IsKey = true)]
 public virtual Guid Id { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual int EnrollmentScore { get; set; } // Cannot use Score, as that is used by Solr

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid SectionId { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid SubscriptionId { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual DateTime ExpirationDate { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid UserId { get; set; }

 [SolrField(Stored = true, Indexed = true, LowercaseCopy = true, TokenizedCopy = true)]
 public virtual string UserFirstName { get; set; }

 [SolrField(Stored = true, Indexed = true, LowercaseCopy = true, TokenizedCopy = true)]
 public virtual string UserLastName { get; set; }
}

public class AssignmentStatus
{
 [SolrField(Stored = true, Indexed = true, IsKey = true)]
 public virtual Guid Id { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual int StatusScore { get; set; } // Cannot use Score, as that is used by Solr

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid AssignmentId { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid SectionEnrollmentId { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual int SectionEnrollmentScore { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid SectionId { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid SubscriptionId { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual DateTime ExpirationDate { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual Guid UserId { get; set; }

 [SolrField(Stored = true, Indexed = true, LowercaseCopy = true, TokenizedCopy = true)]
 public virtual string UserFirstName { get; set; }

 [SolrField(Stored = true, Indexed = true, LowercaseCopy = true, TokenizedCopy = true)]
 public virtual string UserLastName { get; set; } 
}

This was an incredible amount of duplication, repetition and fragmentation. Adding a reportable property for a user required a change to the subscription object, the section enrollment object and the assignment status object. The increased maintenance overhead and the likelihood of making a typo were a real deterrent to adding new reportable data to the system.

So, to keep our code DRY (Don't Repeat Yourself), we decided to mirror the nesting of our object graph by using objects and attribute mapping to generate the schema.xml for Solr. We populated the data by calling SQL stored procedures using NHibernate mappings. Because we used the same objects for populating as we did for indexing, we had to keep the associated entity IDs on the objects.

public class Subscription
{
 [SolrField(Stored = true, Indexed = true, IsKey = true)]
 public virtual Guid Id { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual DateTime ExpirationDate { get; set; }

 public virtual Guid UserId { get; set; } // Required for populate stored procedure

 [SolrField(Prefix = "user")]
 public virtual User User { get; set; }
}

public class SectionEnrollment
{
 [SolrField(Stored = true, Indexed = true, IsKey = true)]
 public virtual Guid Id { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual int EnrollmentScore { get; set; } // Cannot use Score, as that is used by Solr

 public virtual Guid SectionId { get; set; } // Required for populate stored procedure

 public virtual Guid SubscriptionId { get; set; } // Required for populate stored procedure

 [SolrField(Prefix = "subscription")]
 public virtual Subscription Subscription { get; set; }
}

public class AssignmentStatus
{
 [SolrField(Stored = true, Indexed = true, IsKey = true)]
 public virtual Guid Id { get; set; }

 [SolrField(Stored = true, Indexed = true)]
 public virtual int StatusScore { get; set; } // Cannot use Score, as that is used by Solr

 public virtual Guid AssignmentId { get; set; } // Required for populate stored procedure

 public virtual Guid EnrollmentId { get; set; } // Required for populate stored procedure

 [SolrField(Prefix = "enrollment")]
 public virtual SectionEnrollment Enrollment { get; set; }
}

This resulted in less code and achieved the same effect by adding "." separators to the schema.xml fields. For example, we used "enrollment.subscription.user.lastname" to signify the user's last name in the assignment status report. Because this broke from standard JSON structure, we had to write our own parser for the results Solr returned. We did this by tweaking the JSON parser we already had in place to accommodate "." separators rather than curly braces.
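For illustration only (this is not our actual parser), dotted field names coming back from Solr can be folded into nested dictionaries by splitting on the separator:

using System.Collections.Generic;

public static class DottedFieldParser
{
    // Folds flat keys such as "enrollment.subscription.user.lastname" into nested dictionaries.
    public static Dictionary<string, object> ToNested(IDictionary<string, object> flatFields)
    {
        var root = new Dictionary<string, object>();
        foreach (var field in flatFields)
        {
            var parts = field.Key.Split('.');
            var current = root;
            for (var i = 0; i < parts.Length - 1; i++)
            {
                // Create the intermediate level if it does not exist yet.
                if (!(current.TryGetValue(parts[i], out var child) && child is Dictionary<string, object> level))
                {
                    level = new Dictionary<string, object>();
                    current[parts[i]] = level;
                }
                current = level;
            }
            current[parts[parts.Length - 1]] = field.Value;
        }
        return root;
    }
}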

With our object graph finalized and the Solr implementation in place, we began to address the nested update locking issue we had discussed in Full-Text Search - Part 1. We solved this problem in the new system by adding SQL triggers and an update queue. When an entity was inserted, updated or deleted, the trigger inserted an entry into its queue table. Each entity had a separate worker process that processed its table queue and queued up related entities into entity-specific queue tables. This took the work out of the user's HTTP request and put it into a background process that could take all the time it required.

To lessen the user impact even more, the trigger just performed a straight insert into the queue table without checking if an entry already existed for that entity. This had a positive impact for the user but meant that Solr would be hammered with duplicate data. To avoid the unnecessary calls to Solr, we used a distinct clause in our SQL query that returned the top X number of distinct entities and recorded the time stamp of when it occurred. After sending the commands to Solr to update or delete the entity, it then deleted any entries in the queue table with the same entity ID that were inserted before the time stamp.
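A rough sketch of how such a worker might look follows; the table and column names (UserIndexQueue, EntityId, InsertedUtc) and the ADO.NET approach are assumptions for illustration, not our actual implementation:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public class IndexQueueWorker
{
    private readonly string _connectionString;

    public IndexQueueWorker(string connectionString)
    {
        _connectionString = connectionString;
    }

    public void ProcessBatch(int batchSize, Action<Guid> sendToSolr)
    {
        var cutoff = DateTime.UtcNow;
        var entityIds = new List<Guid>();

        using (var connection = new SqlConnection(_connectionString))
        {
            connection.Open();

            // Duplicate queue rows collapse into a single unit of work per entity.
            using (var select = new SqlCommand(
                "SELECT DISTINCT TOP (@batch) EntityId FROM UserIndexQueue WHERE InsertedUtc < @cutoff",
                connection))
            {
                select.Parameters.AddWithValue("@batch", batchSize);
                select.Parameters.AddWithValue("@cutoff", cutoff);
                using (var reader = select.ExecuteReader())
                {
                    while (reader.Read())
                        entityIds.Add(reader.GetGuid(0));
                }
            }

            foreach (var id in entityIds)
            {
                // Update or delete the Solr document for this entity.
                sendToSolr(id);

                // Remove every queued duplicate that predates the batch timestamp;
                // rows inserted afterwards stay queued so later changes are not lost.
                using (var delete = new SqlCommand(
                    "DELETE FROM UserIndexQueue WHERE EntityId = @id AND InsertedUtc < @cutoff",
                    connection))
                {
                    delete.Parameters.AddWithValue("@id", id);
                    delete.Parameters.AddWithValue("@cutoff", cutoff);
                    delete.ExecuteNonQuery();
                }
            }
        }
    }
}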

Solr full-text indexing, coupled with a robust change-tracking queue and an easily implemented attribute mapping system, provided us with a solid reporting backend that could be used for all our reporting requirements. We still had to add an interface on top of it, but most of the heavy lifting was done. Full-text search was implemented successfully!


Monday, August 12, 2013

Multiple Session Factories and the Second Level Cache

In a previous post, we discussed our approach to delaying the delete operation so that the user does not have to pay the price of waiting for it to finish. Instead, we set the IsDeleted flag to true and queue up a deletion task. It has worked well for us, although we have run into a few issues. Let's look at how multiple session factories interact with the second level cache.

Before we start, let's have a quick look at the NHibernate caching system. NHibernate uses the following caches:
  • Entity Cache
  • Query Cache
  • Collections Cache
  • Timestamp Cache
By default, NHibernate will clear out the proper cache based on which entities are being inserted and deleted; the issue we are running into only appears once multiple session factories are involved. Let's look at this query.

// Queries the DB, inserts the results into the cache
session.QueryOver<Users>().Where(u => u.FirstName == "John").Cacheable().List();
// Pulls the result from the cache
session.QueryOver<Users>().Where(u => u.FirstName == "John").Cacheable().List();

When NHibernate receives the result from the database, it stores the entities in the entity cache and the set of returned IDs in the query cache. When you perform the query again, it pulls the list of IDs from the query cache and then hydrates each entity from the entity cache.

Now suppose we delete a user in between performing both of these queries, or perhaps create a new one.

// Queries the DB, inserts the results into the cache
session.QueryOver<Users>().Where(u => u.FirstName == "John").Cacheable().List();
// Marks the cached query results as stale
session.Delete(john);
// Queries the DB again
session.QueryOver<Users>().Where(u => u.FirstName == "John").Cacheable().List();
 
NHibernate would take notice and not pull the second query from the query cache, but would instead return to the database for the latest information. In this way, NHibernate does a rather great job of taking care of the cache. For a bit more information, have a look at this post by Ayende.

Now suppose that, instead, we create the two sessions in the example above from different session factories with identical configurations. The second level cache is shared between them and will still be used, but if the delete is performed in between, the second query will still hit the cache.

// Queries the DB, inserts the results into the cache
session1.QueryOver<Users>().Where(u => u.FirstName == "John").Cacheable().List();
// Marks the query and entities as stale in session1's factory
session1.Delete(john);
// Does not notice that session1 marked it as stale; pulls from the cache
session2.QueryOver<Users>().Where(u => u.FirstName == "John").Cacheable().List();

It would seem that sharing the timestamp cache should take care of this; perhaps the timestamp cache is not shared between the factories.

The cache is not designed to be shared between session factories. Normally, the chances of key collisions are low because the entity GUID is part of the key, but since we create multiple session factories to access the same database (or if you used an incrementing int as the key), key collisions are possible. Most of the time, you could use the region prefix, as shown in the blog post or in the bug report.

Where does this leave us? Because the DeleteVisibleSessionFactory is only used to access entities that are about to be deleted, we decided that caching these entities is pointless and disabled caching on that factory. This prevents it from retrieving any stale data. The last issue is that an entity deleted in the DeleteVisibleSession will not be removed from the second level entity cache, so we now clear the entity cache manually after any delete in the event listeners.

NHibernateHelper.EvictEntity(@event.Entity as ModelBase2);

Our query caches are granular (they often contain the ID of a parent object), so we decided to manage them on a per-case basis and clear them individually. This gives us the best compromise between complexity and performance: the entity cache is managed properly by NHibernate, and the query cache is our responsibility.
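For example, a stale query-cache region can be cleared by hand right after the delete; the region name below is made up for illustration:

// Clear one cached query region that we know is now stale.
sessionFactory.EvictQueries("UsersByFirstName");

// Or clear all second-level cache entries for a type across the factory.
sessionFactory.Evict(typeof(User));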



Wednesday, July 31, 2013

Handling NHibernate Entity Without Session

NHibernate provides the "session" as a persistence manager interface. The session is used for adding, retrieving and updating any information (i.e. entities) in the database, and objects are persisted to the database with its help. A session has many responsibilities, including maintaining the database connection, the transaction and the entity context information. In most cases, detached or transient entities can be attached to a session, but sometimes we are unable to determine the entity's state, especially if the session that originally instantiated it has been closed. For these instances, the NHibernate session provides us with the Merge method.

When merging an entity into a session, here is what happens:
  • If the current session has an entity with the same ID, the changes made to the detached entity are copied to the persistent entity in the session, and the persistent entity is returned.
  • If the current session does not have an entity with the same ID, it loads the entity from the database. The changes are copied to the persistent entity, and the persistent entity is returned.
  • If the entity does not exist in the database, the current session creates a new persistent entity and copies the data from the detached entity.

Merge does not return the entity that was passed in; it returns a persistent entity that contains all the "merged" changes and is associated with the current session. When the current session's transaction is committed, all the changes are saved to the database. The sample code below demonstrates how Merge can be used. (The assert statement shows that the merged entity is a different instance from the one that was passed in.)

//
//
//
user.FirstName = "Brad";
user.LastName = "Henry";
user.Email = "bradhenry@nexportsolutions.com";

using (var session = SessionFactory.OpenSession())
{
    session.BeginTransaction();
    var mergedUser = session.Merge(user);
    session.Transaction.Commit();
    Assert.False(ReferenceEquals(user, mergedUser));
}
//
//
//

public User CreateUser()
{
    var user = new User
    {
        FirstName = "Robert",
        LastName = "Smith",
        Email = "robertsmith@nexportsolutions.com"
    };

    using (var session = SessionFactory.OpenSession())
    {
        session.BeginTransaction();
        session.Save(user);
        session.Transaction.Commit();
        session.Evict(user);
    }

    return user;
}

In conclusion, Merge enables an entity to be persisted even if that entity is not associated with the current session. This gives us the flexibility of processing an entity without an open session: the developer can fetch an entity with one session, free that session immediately and save the processed entity with another session. Thus, sessions stay short-lived.


Thursday, July 25, 2013

Full-Text Search - Part 2

In my previous post Full-Text Search - Part 1, I discussed our reporting solution using NHibernate Search and MSMQ. NHibernate Search had been dead in the water for months when our administrators began complaining that user search results were extremely slow. We were using NHibernate to do a joined query on two tables with a "like" condition on four of the columns. Needless to say, it was not the most efficient operation.

Our development team decided to revisit Full-Text Search and scrap NHibernate Search. Instead, we wanted to develop a service that we could communicate with over HTTP that would update Lucene documents. We were researching how this could be done when we found Apache Solr. It did exactly what we wanted to do with our proposed service but also had the benefit of a strong user base. We scrapped the idea of creating our own service and decided to go with the proven solution instead.

Because of our unfamiliarity with the product, we postponed automating the Solr installation until a later release and installed it manually. Solr is written in Java and needs a Java servlet container to run in. Apache Tomcat was the recommended container, so we installed it first by following the installation prompts. Installing Solr simply required copying the solr.war file from the download package into the lib folder of the Tomcat directory. All of the other entity-specific work was more easily automated.

The remaining automation tasks were generating the Extensible Markup Language (XML) files for the entities we were indexing and copying over the configuration files. The Tomcat context XML files were required to tell Tomcat where the indexes lived on disk. The schema XML file was required as a contract so that Solr could correctly construct the Lucene documents and so that we could communicate changes to those documents.

To avoid processing our complex object graph, we decided not to tie the index processing to our existing entities. Instead, we created a separate entity that only had the four fields we cared about: First Name, Last Name, Login Name and Primary Email. We wrote our own SolrFieldAttribute class that included all the flags needed to create the Solr schema XML file for the entity. We also created a SolrClassAttribute to indicate the classes for which to create the schema XML file and to create the Tomcat context file.
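A cut-down sketch of what those attribute classes might look like is shown below; the flags are the ones discussed in this series, and the real classes almost certainly carry more:

using System;

// Hypothetical shape of the mapping attributes; later sections add more flags
// (multi-valued, lowercase and tokenized copies).
[AttributeUsage(AttributeTargets.Property)]
public class SolrFieldAttribute : Attribute
{
    public bool Stored { get; set; }   // keep the raw value in the Lucene document
    public bool Indexed { get; set; }  // make the field searchable
    public bool IsKey { get; set; }    // becomes the uniqueKey in schema.xml
}

[AttributeUsage(AttributeTargets.Class)]
public class SolrClassAttribute : Attribute
{
    // Marks a class as needing a generated schema.xml and Tomcat context file.
}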

Initially, we only stored the four fields that were going to be displayed in the result list and that we were going to use to filter the results. This allowed us to process the documents for these lightweight entities much more quickly. Unfortunately, our system also had to restrict the users an administrator could see based on the administrator's permission set. To address this problem, we used Solr's multi-valued property to index all of the organizations in which the user had a subscription. From the code's standpoint, we added another property to the SolrFieldAttribute that indicated to our schema generator that it needed to add multiValued="true" to the schema field.

To provide the best user results list, we also had to add a field to the entity called Full Name that was the First Name and Last Name separated by a space. This would allow a user to enter "Sally Sunshine," and she would be the top user in the results list.

To make the update calls to Solr, we used a background job scheduling service that we already had in place. When an object that needed to be indexed changed, we created an entry in a table with the object's ID and the date it was updated. The scheduling service job, called the indexing job, then pulled rows from the table, converted them to Solr JSON objects and sent HTTP calls to Solr with the JSON. The first problem we encountered was that the JSON used by Solr is not standard JSON: we had to write our own converter instead of using a standard library like Newtonsoft's. Notice in the examples below that Solr JSON does not handle arrays in the same way as standard JSON.

Example of standard JSON:

{
     "id":"a017a879-a2f2-4e37-811d-8f13ace56819",
     "firstname":"Sally",
     "lastname":"Sunshine",
     "fullname":"Sally Sunshine",
     "primaryemail":"ssunshine@sunshine.net",
     "loginname":"ssunshine",
     "organizationid":
     [
          "c04eb269-3a81-442f-85b5-23b4bd87d262",
          "87466c69-1b8f-4895-9483-5c1cb55ecb2b"
     ]
}

Example of Solr JSON:

{
     "id":"a017a879-a2f2-4e37-811d-8f13ace56819",
     "firstname":"Sally",
     "lastname":"Sunshine",
     "fullname":"Sally Sunshine",
     "primaryemail":"ssunshine@sunshine.net",
     "loginname":"ssunshine",
     "organizationid":"c04eb269-3a81-442f-85b5-23b4bd87d262",
     "organizationid":"87466c69-1b8f-4895-9483-5c1cb55ecb2b"
}

After successfully writing our specialized JSON converter, we ran into problems with filtering the results. The beauty of full-text search was that it could return close results instead of just exact matches, but we were not getting the results we expected. The results returned by Solr were sorted by the score of how well each result matched the query criteria. To get the best possible match, we had to add weights to each of the filters: exact matches got the most weight, split-term matches got the next highest weight and wildcard matches got the least weight.

To avoid problems with case sensitivity, we added a Lowercase property to the SolrFieldAttribute that told Solr to automatically add a lowercase version of the indexed field to the document. It did this by having two versions of the field in the schema XML file in addition to a copyField element. We also added a Tokenized property to the SolrFieldAttribute that worked in a similar way but broke the value on spaces rather than lowercasing it. The generated schema fields can be seen below.


<fields>
 <field name="indexdate" indexed="true" type="date" stored="true" multiValued="false" default="NOW"/>
 <field name="id" indexed="true" type="uuid" stored="true" required="true"/>
 <field name="firstname" indexed="true" type="string" stored="true"/>
 <field name="firstname.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="firstname.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="lastname" indexed="true" type="string" stored="true"/>
 <field name="lastname.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="lastname.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="fullname" indexed="true" type="string" stored="true"/>
 <field name="fullname.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="fullname.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="primaryemail" indexed="true" type="string" stored="true"/>
 <field name="primaryemail.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="primaryemail.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="loginname" indexed="true" type="string" stored="true"/>
 <field name="loginname.lowercase" indexed="true" type="lowercase" stored="false"/>
 <field name="loginname.tokenized" indexed="true" type="text_en" stored="false"/>
 <field name="organizationid" indexed="true" type="uuid" stored="false" multiValued="true"/>
</fields>
<uniqueKey>id</uniqueKey>
<copyField dest="firstname.lowercase" source="firstname"/>
<copyField dest="firstname.tokenized" source="firstname"/>
<copyField dest="lastname.lowercase" source="lastname"/>
<copyField dest="lastname.tokenized" source="lastname"/>
<copyField dest="fullname.lowercase" source="fullname"/>
<copyField dest="fullname.tokenized" source="fullname"/>
<copyField dest="primaryemail.lowercase" source="primaryemail"/>
<copyField dest="primaryemail.tokenized" source="primaryemail"/>
<copyField dest="loginname.lowercase" source="loginname"/>
<copyField dest="loginname.tokenized" source="loginname"/>

The initial indexing of entities in the database took some time. We wrote a separate job that went through the Users table and added rows to the update table that the indexing job was monitoring. After the initial indexing, each NHibernate update for a user inserted a row into the update table. Because the indexing job processed hundreds of entities at a time, the user documents closely mirrored the changes to the database.

Our administrators were very happy with the speed of our new Full-Text user management search. After performing a search, the administrator could then click the user's name in the list to be taken to the user's full profile. This increase in efficiency caused administrators to begin to wonder where else this magical solution could be applied.

In Part 3, we'll bring this series full-circle and discuss how Full-Text Search was finally able to make its way into our reporting system. Stay tuned!


Friday, July 19, 2013

Using Google Spreadsheet as a Burndown Chart


Introduction

No one in our shop questions the value of having a highly visible burndown chart to track our daily progress, but as we move to a more distributed team, with developers working remotely or from home, we are presented with some challenges in both updating the burndown chart and making it available.

Discussion

Our teams generally update the burndown chart daily: we simply take the number of remaining story points (for the current release) as recorded in our issue tracker and mark that number on the burndown chart. The chart was usually kept in the hallway or some other room where it was highly visible.

With distributed team members, we had to find a way to share a burndown chart easily. We tried several methods and finally decided to go with a Google spreadsheet. The spreadsheet allows us to display a chart for our burndown, update it daily and share it with management or testing.

Sample of the Completed Burndown

Step 1: Create a Google Spreadsheet

You'll need to start by creating a new Google spreadsheet with two sheets. The first sheet should be named chart and the second sheet named data.

Detail of the 2 sheets

Step 2: Setup the Data Columns

Select the DATA sheet, and add the columns that you will need for your chart: Sprint, Date, Actual, Projected and Ideal. Sprint is used for reference and does not display in the burndown chart.


Step 3: Setup the Chart

Switch to the CHART sheet and, from the Insert menu, select the CHART option.

In the chart edit dialog, set the data range to "DATA!B1:E100" and select line chart as the type. Select OK or Update.



Updating your Chart Data 

Now you are ready to begin using your new burndown chart. We usually fill in the Ideal column with values that give us a straight path from our starting value all the way to zero. The Actual column is filled in every day with the actual number of REMAINING story points on that day. After you have completed a few iterations (or sprints), you can begin calculating a projected path based on your current velocity.


Conclusion

Using some of the sharing and publishing features built into Google Apps, you should be able to share read-only versions of your burndown chart. So far, we have had great luck with this method, and it has worked well for our entire team.


Monday, July 15, 2013

NHibernate Delayed-Deletes


We strive to provide our users with a swift and responsive user interface. No one likes waiting around for their computer to do something, especially if user input is not necessary. One such case would be when a user asks for something to be deleted. There is no need to have the user wait for the delete to finish. This work can be done in the background.

In the interest of increasing performance on the front end, we utilize a separate delete queue rather than deleting entities in the user's session. Deletions seem almost instantaneous to the user, even if they take a while due to cascading deletes and the creation of numerous audit logs. Of course, we don't want the deleted entities showing up in the user interface while the delete queue is doing its work, so we need to filter the results and never present entities that have been marked for deletion. What we are looking for is a way to perform a soft delete with the ability to find deleted entities when necessary, in the delete queue for instance.

We use NHibernate for our Data Access Layer (DAL), so the brute-force approach to this problem would be to add a where clause to the end of each query. With the number of queries used by the system, this seems a rather poor plan, and collections and foreign key relations would not give us the option of adding a where clause. This approach would give us great performance, since we would only apply the "where" where needed, but there would be a good chance of missing a query.

The standard way to perform soft deletes in NHibernate seems to be to add a where clause to the class tag in the mappings. This makes it much more difficult to access data that has been marked for deletion but not yet deleted: with the class mapping containing the where clause, no query will ever return a marked entity. One could put the where clause on a subclass instead, allowing access to the deleted data via the parent, but that also seemed to add unnecessary code and complexity.

Then, in our research, we found another really nice feature of NHibernate. While working on the "where" approach, we discovered NHibernate filters. This feature "allows us to very easily create global where clauses that we can flip on and off at the touch of a switch." Perfect! Just what we needed: turn the filter on by default and then turn it off as needed by simply calling session.DisableFilter. So we added the filter to one object and enabled it in the OpenSession call.

The filter def:
<filter-def name="NotMarkedForDeletion" />
The code to add to each class:
<class name="User" table="Users">
    ...
    <property name="MarkedForDeletion" />

    <filter name="NotMarkedForDeletion"
        condition="MarkedForDeletion != 1" />
</class>
Add the filter to the OpenSession call.
public static ISession OpenSession(string sessionName = "Nexport Session")
{
    var session = SessionFactory.OpenSession();
    session.FlushMode = FlushMode.Commit;

    if (AppSettingsKeys.Cluster.EnableNHibernateProfiler)
        HibernatingRhinos.Profiler.Appender.NHibernate.NHibernateProfiler.RenameSessionInProfiler(session, sessionName);

    session.EnableFilter("NotMarkedForDeletion");

    return session;
}
Nothing like an elegant solution. But oh, look: about a third of the unit tests fail with rather strange and random error messages when the test harness is run. Maybe this was not the best solution.

After the filters approach failed and fixing the tests turned out not to be a quick fix, it was back to the drawing board. Adding a where clause to the class mapping seemed to work rather well; it just made it impossible for us to access the marked data afterwards, preventing us from actually deleting it from the system. But if, instead, we create two instances of the session factory, one with the where clause on the mappings and the other without, we can control the visibility of marked objects. We lose some of the flexibility we would have had with filters, which let us switch at any point within a session, but we still have session-by-session control.

We just need to create two session factories:
    private static ISessionFactory SessionFactory { get; set; }
    private static ISessionFactory DeleteSessionFactory { get; set; }

    public static void Init()
    {
        ...

        // This factory keeps the default mappings, so it can still see marked entities.
        DeleteSessionFactory = cfg.BuildSessionFactory();

        // Add the where clause to every class that supports delete marking, then
        // build the default factory that filters marked entities out.
        foreach (Type deleteMarkable in System.Reflection.Assembly.GetExecutingAssembly().GetTypes()
            .Where(mytype => mytype.GetInterfaces().Contains(typeof(IDeleteMarkable))))
        {
            var classMapping = cfg.GetClassMapping(deleteMarkable);
            try
            {
                classMapping.Where = "MarkedForDeletion != 1";
            }
            catch (Exception)
            {
                // Types without their own class mapping are skipped.
            }
        }

        SessionFactory = cfg.BuildSessionFactory();
    }
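The two factories can then be exposed through the session helper. The method names below are illustrative, not necessarily what NexPort uses:

    // Default sessions filter out entities marked for deletion via the mapping's where clause.
    public static ISession OpenSession()
    {
        return SessionFactory.OpenSession();
    }

    // Used by the delete queue and audit logging, which still need to see marked entities.
    public static ISession OpenDeleteVisibleSession()
    {
        return DeleteSessionFactory.OpenSession();
    }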

One thing to touch on is cascading. When marking an entity for deletion, we want to mark all of its children for deletion, too. However, marking every child would take about as long as simply deleting them, so we mark some, but not all, of the children as deleted. The rest are entities that are impossible, or at least nearly impossible, to reach without the parent entity.

In the end, creating two session factories worked rather well. A few tests failed at first, but they were quickly rectified by replacing the default session with one that can access marked entities in a few places, mostly in the audit logging triggered by a delete call on the object. It is still a mystery why the filters caused the issues they did. We now have a system that filters out objects marked for deletion so that the user cannot see them, while still giving us the flexibility of NHibernate when deleting those objects. Furthermore, having other entities implement the deletable interface should now be even simpler: they need only implement the interface and add the MarkedForDeletion property and column, and everything else will be handled automatically.

These changes greatly increase the speed of deletions from the user's perspective. Rather than waiting for the entity and all of its associations to be deleted, a delete is now nothing more than a few SQL update calls, with everything else happening on the backend.


Friday, June 28, 2013

Full-Text Search - Part 1

The journey with full-text search has not been an easy one. Two and a half years ago, we began looking for a faster way to do searches across multiple related database tables. Our primary target was to improve the performance of reporting on student data in our Learning Management System, NexPort Campus. We were already using the Castle ActiveRecord library to map our C# classes to our database tables with attributes. ActiveRecord is built upon NHibernate, a popular .NET Object Relational Mapper (ORM). Because we were already used to attribute mapping, it made sense to try to use a similar toolset for full-text search. Enter NHibernate Search.

NHibernate Search was an extension of NHibernate built upon Lucene.NET, a .NET port of the Java full-text search engine Apache Lucene. Similar to ActiveRecord, NHibernate Search used attribute mapping to designate what should be included in the documents stored in Lucene.NET's document database. (A document is a collection of text fields stored in a denormalized way to make queries faster.)

At the time of our design implementation, Lucene.NET was several versions behind its ancestor. This should have been a red flag, as was the fact that NHibernate Search had not had a new release in some time. Despite these troubling indicators, we plowed on. We started by mapping out all of the properties required to sustain our previous SQL reporting backend. Our model is quite complex, so this was no easy task. Primitive types and simple objects such as String and DateTime used a Field attribute, and user-defined objects used an IndexedEmbedded attribute. In addition to the basic attributes required by NHibernate Search, we also had to write separate IFieldBridge implementations and include the FieldBridge attribute on each property. Needless to say, our class files exploded with non-intuitive code.

NHibernate Search used the attributes in its listeners to determine when an object had changed and needed to be re-indexed. If a related object changed, it would then trigger the next object to be processed, all in the same session. In our case, one of the indexed objects was a training record object, the section enrollment. If a user object changed, it triggered both itself and all related subscriptions to be re-indexed, which then triggered all section enrollments to be re-indexed. This led to a very large problem in our production system, which I will detail in a bit.

The whole idea of this undertaking was to decrease load on the database server while making search and reporting results faster. To that end, we put the indexing work on a separate machine. To communicate the documents to be indexed, we used Microsoft Message Queuing (MSMQ) and wrote our own backend queue processor factory for NHibernate Search. When an object changed, it was translated by NHibernate Search into LuceneWork objects, which were then serialized into a packet that MSMQ could handle. If the packet was too large, it was split into multiple packets and re-assembled on the other side. MSMQ worked fine when the machines were on the same domain. However, when we went to our Beta system, cross-domain issues began to crop up. After hours of research and trial and error, we finally solved the problem by tweaking the Global Catalog in the domain controller.

To make reads even faster, we implemented a Master-Slave relationship with our indexes. One master index was for writes, and there could be one or more slave indexes to read from. In our first attempt, we used Microsoft's Distributed File System (DFS) to keep the slaves updated from the master. We quickly ran into file-locking problems, so we went to a code-based synchronization solution. We used the Microsoft.Synchronization namespace to replicate the data, ignoring specific files that were causing locking problems.

The file synchronization code was the last piece of the puzzle. After spending months working on the new full-text search reporting backend, it was finally time to release the product. Remember the large problem I mentioned earlier? Well, as soon as users started logging into the system, the extra processing added by NHibernate Search brought the server to its knees. It took minutes for people to do something as simple as login to the system. We immediately had to turn off the listeners for NHibernate Search and re-release NexPort Campus. It was a complete and utter disaster.

The moral of this story is not that NHibernate Search is the devil. The main problem with this solution was over-engineering. Trying to avoid future work by cobbling together too many third-party components that just did not fit well together was short-sighted and ended up being more work in the end. It made for ugly code and an unmaintainable system.

In the weeks following our disastrous release, another developer and I began to think of ways to offload the extra processing. We had some good ideas for it and were in the process of making the changes when priorities changed. Full-text search sat in an unused, half-finished state for nearly two years. When the idea came up to improve the search capability and performance of our user management system, we revisited the full-text search solution. That's when we discovered the holy grail of full-text search, Apache Solr.

For the story of how Solr saved the day, please stay tuned for my next post, Full-Text Search - Part 2.


Friday, June 21, 2013

The Reason for a Blog

Welcome to the inaugural blog post of the Nexport Solutions Engineering Blog. Many of you may be wondering who we are and why you would be interested in our blog. I hope to answer both points in this, our first blog post.

First, a quick story about WHO we are. Nexport Solutions is a division of Darwin Global, LLC (DG), a recent off-shoot of Advanced Systems Technology, Inc. (AST). Historically, AST has been in the business of providing software solutions and other services in the public or government sector. More than 12 years ago, AST began to explore the private or commercial sector. Recently, in an attempt to better focus on our commercial clients and partners, the commercial business split from AST into what is now Darwin Global. Nexport Solutions operates from Oklahoma City and provides Systems and Software engineering services to DG and its partners. You can find more details about the business of Nexport Solutions on the main Nexport Solutions Blog.

As an engineering group, we find ourselves solving many complex software engineering problems on a daily basis. These problems can be specific to writing code, coding patterns, configuration management, methodology development or just about anything else you can imagine that relates to our trade. I've long felt that we needed a better sounding board for our ideas and that we need to do a better job of contributing back to the engineering and software development community. So, my goal for this blog is to provide an outlet for our ideas and to share with the larger community our discoveries, opinions and general musings on our craft.

Every week, one of our engineers or QA specialists will publish an article on some topic that is relevant to our team as a whole. We are an agile shop that develops primarily web applications in C#, HTML and JavaScript. We practice test-driven development as often as possible and believe in a highly automated configuration management system with Continuous Integration as its core driver. Our stack of choice has been Castle MonoRail/ActiveRecord, but we are in the process of slowly switching to MS MVC/NHibernate.

Over my last 12 years with this engineering group, I have seen our technology and methodology change dramatically. I have seen our software go from a combination of Classic ASP and ColdFusion to a fully distributed and cloud server architecture running on the .Net platform.

Finally, I am very proud of our team and the incredible software that they develop. I am regularly surprised by their ingenuity and as we begin to grow our blog I am sure you will be, as well!

 