Archive for June, 2009

The Wisdom of Crowds

Tuesday, June 30th, 2009

Software development processes capture, in reusable form, the organization’s best practices and lessons learned, making them sharable across projects. Today, the benefits of process-centric software delivery are well understood. So why is the industry’s adoption of software development processes so dismal? The lower-than-expected adoption can be partly explained by a phenomenon called groupthink. For this post I will rely on material from James Surowiecki’s widely cited book, The Wisdom of Crowds.

The coinage of the term groupthink predates Surowiecki’s work, but he frames it concisely within the larger arena of group decision making, which he shows is, in most cases, more accurate than decisions made by subject matter experts. This easy-to-read book draws from and consolidates various scientific and empirical bodies of work from diverse fields, such as psychology, statistics, and economics, making the subject generally accessible.

For a crowd to produce correct decisions, its members must be diverse, independent, and decentralized, and there must be a mechanism to consolidate the individual judgments into a collective decision. However, decision making fails when the members of the crowd are too conscious of the opinions of others and begin to emulate each other and conform rather than think differently. This failure is called groupthink.
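To make the consolidation mechanism concrete, here is a minimal sketch in Python. The guesses, the true value, the leader’s guess, and the `conformity` weight are made-up numbers for illustration, not data from Surowiecki’s book, and a simple average is only one possible way to consolidate judgments.

```python
from statistics import mean

def crowd_error(guesses, truth):
    """Error of the collective decision (the average of all guesses)."""
    return abs(mean(guesses) - truth)

def typical_individual_error(guesses, truth):
    """Average error of the individual members taken one at a time."""
    return mean(abs(g - truth) for g in guesses)

def apply_groupthink(guesses, leader_guess, conformity):
    """Pull every guess toward a vocal leader's guess.

    conformity = 0.0 leaves members independent; 1.0 makes everyone
    repeat the leader. This is a toy model, not an empirical one.
    """
    return [(1 - conformity) * g + conformity * leader_guess for g in guesses]

# Hypothetical example: five people estimate a task that truly takes 100 hours.
truth = 100
independent = [70, 85, 110, 120, 125]
conformed = apply_groupthink(independent, leader_guess=160, conformity=0.8)

print(crowd_error(independent, truth))               # independent crowd
print(typical_individual_error(independent, truth))  # average member
print(crowd_error(conformed, truth))                 # crowd under groupthink
```

With these invented numbers, the independent crowd is off by 2 hours while the average member is off by 20. Once everyone leans toward the leader, the individual errors stop cancelling out and the crowd is off by roughly 48 hours.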

I believe commercial software methodologies have been suffering from groupthink. For over a decade, most efforts have centered around the Unified Process (UP), with all participants (mainly methodology theorists and consultants) emulating each other and conforming rather than thinking differently. Any new development, such as Eclipse Process Framework or SCRUM, has been forced to fit into a UP mold. Practitioners have found these expertly devised methodologies irrelevant and, hence, have mostly avoided them. At the same time, new practical ideas that arise during actual development projects are prevented from blossoming. Methods devised by the practitioners of diverse projects are likely to be more relevant, as they convey the wisdom of crowds (mobs), and consequently have a better chance of wide adoption.

The good news is that with the recent availability of integrated ALM and interactive process asset repository systems, it is now possible to involve practitioners in the end-to-end methodology development effort. I will cover this in more detail shortly.

Multi-language Development

Friday, June 26th, 2009

As if the development of software systems were not complex enough, it is about to get even more complex.

In the past few years, there has been much fruitful activity in the area of programming languages, resulting in the introduction of various languages. Although some of these languages have been around for a number of years, the recent availability of robust, high-performance versions has made their use in large commercial systems possible. Broadly speaking, these new languages can be grouped into two categories: dynamic languages and functional languages.

Dynamic Languages

For many developers, strongly typed languages are too restrictive, contextually unintuitive, and verbose. Dynamically typed languages, on the other hand, are simple and elegant, ideally suited for rapid prototyping and development. They are especially well suited for development projects where time-to-market is critical.

Functional Languages

With the advent of universal deployment of multi-core hardware and the imminent availability of cheap, massively parallel systems, the need to make concurrent programming mainstream has attracted much attention. The non-imperative nature of functional programming results in code with no side effects, which is well suited for concurrent programming. Functional programming is also very efficient for implementing certain algorithms, resulting in dramatic code reduction.
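The link between side-effect-free code and concurrency can be sketched in a few lines. Python is used here only for illustration (it is not one of the functional languages discussed in this post), and `word_count` and the sample `documents` are hypothetical. The sketch shows the safety argument, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    """Pure function: the result depends only on the argument, and
    nothing outside the function is read or modified."""
    return len(text.split())

documents = ["multi core hardware", "functional code has no side effects", ""]

# Sequential evaluation.
sequential_result = list(map(word_count, documents))

# Concurrent evaluation: because word_count shares no mutable state,
# the calls can run in any order on any worker without locks.
with ThreadPoolExecutor(max_workers=3) as pool:
    concurrent_result = list(pool.map(word_count, documents))

print(sequential_result == concurrent_result)
```

Because each call is independent, the concurrent run produces exactly the same list as the sequential one, whichever worker finishes first.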

Multi-language Platforms

Aside from robustness, the new versions of the above languages are designed to run on common runtime environments: virtual machines. This allows multiple languages to co-exist in a single software system.

Imperative, type-safe, object-oriented languages (such as Java and C#), dynamic languages (such as Groovy and IronRuby), and non-imperative functional languages (such as Clojure and F#) can run seamlessly together on commercially available virtual machines such as the JVM and the .NET CLR/DLR. Therefore, no technical barriers to the development of multi-language systems exist today.

Example: Mobile Application with Cloud-based business logic.

Imagine a mobile application that provides sophisticated investors with instantaneous alerts based on news about or coverage of a company. Such an application constantly scans a large number of online sources for any mention of the company’s name, references to its competitors, or general macro-economic events that can affect it. The system then analyzes the raw inputs and provides the user with a single metric indicating the effect (positive or negative) of the latest chatter on the company’s stock price. Based on this metric, the investor can take appropriate action.

Development of this system requires both mobile client and server-side development. The client side could be an iPhone application implemented in Objective-C and UIKit.

The server side is more complex. It requires a concurrent component, running on a massively parallel cloud computing platform, that constantly monitors and filters a large number of news feeds, blogs, and possibly social networking sites. The analysis of the filtered information may require specialized algorithms, some aspects of which require constant enhancement. These functionalities are best implemented using a combination of object-oriented, dynamic, and functional programming languages.
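The data flow on the server side (monitor, filter, analyze, reduce to one metric) can be sketched as follows. This is a toy illustration in a single language, not the multi-language design argued for here: the word lists, the `filter_feed`, `score_item`, and `company_metric` functions, and the sample feeds are all hypothetical, and counting keywords is only a stand-in for a real analysis algorithm.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical word lists; a real system would use a proper analysis algorithm.
POSITIVE = {"beats", "growth", "upgrade"}
NEGATIVE = {"lawsuit", "recall", "downgrade"}

def filter_feed(feed, company):
    """Keep only the items in one feed that mention the company."""
    return [item for item in feed if company.lower() in item.lower()]

def score_item(item):
    """Score one item: +1 per positive word, -1 per negative word."""
    words = item.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def company_metric(feeds, company):
    """Filter all feeds concurrently, then reduce to a single number."""
    with ThreadPoolExecutor() as pool:
        filtered = pool.map(lambda feed: filter_feed(feed, company), feeds)
        mentions = [item for items in filtered for item in items]
    return sum(score_item(item) for item in mentions)

feeds = [
    ["Acme beats forecasts on strong growth", "Weather stays mild"],
    ["Regulator files lawsuit against Acme", "Globex announces recall"],
]
print(company_metric(feeds, "Acme"))
```

The filtering step is independent per feed, which is why it is the natural candidate for the concurrent component, while the scoring rules are the part most likely to need constant enhancement.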

Process and Multi-language Development

Because of the inherent complexity of multi-language systems, software development processes and integrated application lifecycle management (ALM) systems are critical to the development of such systems. Imagine the various categories of practitioners and skills involved, each with its own unique development, validation, and verification approach and tools.

Initial Take on IBM’s Measured Capability Improvement Framework

Sunday, June 14th, 2009

Earlier this month, IBM Rational announced their Measured Capability Improvement Framework (MCIF) offering, vaguely described in IBM’s official news release:

Additionally, with IBM’s new Measured Capability Improvement Framework (MCIF), organizations can also take actions to continuously improve on results by learning from past experiences. Through MCIF, IBM provides organizations with an end-to-end framework that enables them to measure results and manage projects so they can incrementally improve their software delivery capability. 

“In today’s economic climate, businesses are looking for new ways to derive greater value from their investments in software,” said Dr. Daniel Sabbah, general manager, IBM Rational Software. “Up until this point, organizations have been lax in measuring the business value and discipline of the processes [emphasis added] they use to deliver software assets. Classic metrics in software engineering largely ignore the importance of actual business outcomes. Our clients are now beginning to realize that the software they build or assemble must be treated as a strategic business asset. IBM is committed to helping them make the right decisions and improve the successful outcomes of this newly emerging business process discipline.” 

 

Measurement-driven improvement is central to CMMI, and CobiT places a strong emphasis on aligning IT processes with the organization’s business objectives. If you have read my blog, you know that I strongly believe in objective-based process definition. If anything, IBM Rational is playing catch-up. But it is still nice to have their confirmation.

I didn’t attend this year’s Rational Software Conference, but I have been carefully studying the white papers on the MCIF and Rational Insight offerings. I will discuss Rational Insight in a future post.

The MCIF white paper is well written and an enjoyable read. I fully agree with the framing of the differences between business and manufacturing processes and software development processes on pages 3 and 4.

Unlike most other business processes, such as supply chain management or manufacturing, SSD needs to deal with a range of risk. SSD also differs from many other business processes in that it entails a diseconomy of scale: that is, individual productivity decreases with the size of the SSD effort. …

Software delivery differs from many other business processes by dealing with a broad range of innovation. Some software projects, such as maintenance of existing systems, are reasonably predictable, similar to manufacturing processes. Those projects carry limited innovation and drive limited or no business differentiation. Other projects, such as building unprecedented and large software systems, require high degrees of innovation in addressing problems that have never been solved before on a schedule. Committing to delivering innovation requires assuming risk, since the lack of complete knowledge at project inception is inevitable and uncertainty regarding how to proceed is part of the challenge. This risk is manifested in the statistical variance in the estimate of the time or cost to complete. 

A commitment to assuming risk entailed by bringing innovation to the enterprise provides the opportunity to improve ROI.  

Another major difference between the business process of software delivery and other business processes is the diseconomy of scale. Typically, manufacturing and service delivery processes offer economy of scale: The cost of a unit of software grows nonlinearly (i.e., yields cost reduction) with the size and complexity of the system. But this is not the norm in software production.
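The diseconomy of scale described in the white paper can be illustrated with a simple power-law effort model. This is a sketch under stated assumptions: the functional form and the constants `a` and `b` are chosen for illustration and are not taken from the MCIF white paper. Any exponent `b` greater than 1 produces the effect.

```python
def effort(size, a=3.0, b=1.2):
    """Effort in person-months for a system of `size` units, using a
    power law. a and b are illustrative assumptions; b > 1 is what
    makes this a diseconomy of scale."""
    return a * size ** b

def productivity(size, a=3.0, b=1.2):
    """Units delivered per person-month."""
    return size / effort(size, a, b)

for size in (10, 100, 1000):
    print(size, round(effort(size), 1), round(productivity(size), 3))
```

Under this model, doubling the size of the system more than doubles the effort, so productivity per person falls as the project grows.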

On the other hand, some of the insights that have been discovered as part of IBM’s effort are trivial. For example, on page 15 they say:

Many organizations mistakenly try to make one process fit all circumstances. In our experience, the above type of analysis is required to enable you to drive the appropriate change to the right project types. 

I don’t know of any organization that doesn’t believe this. In fact, it sounds condescending. 

In essence, MCIF is a practice-based approach to software development processes, an approach they first introduced in the last version of EPF (before it became inactive). One can argue that IBM was a latecomer to this as well: the concept of practice has been widely utilized in CMMI, Microsoft MSF, and EssentialUP. MCIF is a methodology for top-down selection of practices based on the organization’s business objectives.

Although I like objective-based software development process definition, MCIF is top-down and non-collaborative. It relies on the Rational Method Composer (RMC) tool, which is a single-user desktop application that requires a configuration management system for basic maintenance of processes. The white paper also falls short in addressing the practical issues of mapping business objectives to different aspects of processes and the mechanics of process tailoring.

Finally, from Per’s video, it is apparent that MCIF is not a tool that empowers users; rather, it is a service that requires engaging IBM consulting services.

My recommendation: the best source for improving software development capabilities is the CMMI body of work. As I said before, CMMI is the result of two decades’ worth of work by various subject matter experts, not a single vendor’s commercial methodology.

ORM Deep Dive

Friday, June 5th, 2009

Our design team has spent some time over the last few weeks taking another look at our architectural practices and standardizing on a core web application architecture that we can drop into the heart of any web application and then extend or wrap based on the needs of the particular application we are working on.

Performance is one of the key factors in determining best practices. For data-intensive applications, the performance of the database and the data access layer in particular can make or break an application’s responsiveness more than any other layer in the application.

As we do a lot of ASP.NET/.NET development, one of our designers, Sanguanchai, who is based in our Bangkok office, has been comparing LINQ to SQL with Entity Framework 4 (currently in beta 1) across key considerations, not just performance, for use in our core architecture.

You can take a look at his findings in the post he published yesterday - LINQ to SQL vs Entity Framework. No doubt there are other architects and developers out there trying to get a handle on the same question and we hope you find this useful.

On a slightly related note, if you are the podcast-listening type, .NET Rocks had Dan Simmons, Dev Manager on the EF4 and LINQ to SQL team, as a guest on the latest show. It’s a good listen and you get some background on Microsoft’s decisions, including why EF is at version 4 though it is actually the 2nd release. You also get a view into the decisions behind certain features in EF4, and I personally got a better understanding of how best to leverage certain features of EF4 that had been opaque to me thus far.

Again, the direct link to Sanguanchai’s post:

http://www.osellus.com/blogs/2009/06/04/linq-to-sql-vs-entity-framework/

Enjoy!

Interesting Work on Process Authoring Tools

Friday, June 5th, 2009

A colleague forwarded to me an interesting work by Petter Holmström titled “Ideas for Next Generation Process Authoring Tools”. It’s a long, comprehensive document, and I have just started reading it end-to-end. From a quick scan of the table of contents, abstract, and conclusions, I mostly agree with his conclusions and recommendations:

The tool vendors should shift focus and concentrate on making their tools more collaborative, customizable and scalable to different process sizes. In this thesis, some ideas of how this could be achieved have been presented, of which one of the more interesting ones is a wiki-based authoring tool.

 

As you may have realized from my previous blog postings, I am a strong proponent of collaborative process management tools and of involving developers and other process consumers in the creation of the processes they consume. The industry players and the user community should democratize process authoring and move on from blindly following methodology pundits.

LINQ to SQL vs Entity Framework

Thursday, June 4th, 2009

Feature comparison: LINQ to SQL vs Entity Framework

Feature | LINQ to SQL | Entity Framework
--- | --- | ---
Model | domain model | conceptual data model
Databases supported | SQL Server only | variety of databases
Data sources | tables only | tables, replication, Reporting Services, BI, etc.
Complexity | simple to use | complex to use
Development time | rapid development | slower development but more capabilities
Mapping | class to single table | class to multiple tables
Inheritance | hard to apply | simple to apply
File types | dbml file only | after compilation, generates an edmx file with 3 sections representing the schema: CSDL, MSL and SSDL
Create complex properties (e.g. a Customer type may have an Address property that is an Address type with Street, City, Region, Country and Postal code properties) | Not supported | Supported in VS2010 Beta 1; otherwise we can add them by manually modifying the .edmx file
Query | 1. LINQ to SQL (for select); 2. DataContext (for update, create, delete, stored procedure, view) | 1. LINQ to Entities (for select); 2. Entity SQL (a derivative of Transact-SQL that supports inheritance and associations); 3. Object Services (for update, create, delete, stored procedure, view); 4. Entity Client (an ADO.NET managed provider similar to SqlClient, OracleClient, etc.; it provides components such as EntityCommand and EntityTransaction)
Can synchronize with the database if the database schema changes | Not supported | Supported
Performance | Very slow for the first query | Very slow for the first query, but overall performance is better than LINQ to SQL (please see the results in the attached file)
Continue to improve features in the future | No | Yes
Generate database from entity model | Not supported | Will be supported in VS2010 Beta 1

Limitations of Entity Framework in VS2008 SP1

1. It throws an error at runtime if we use a Contains statement. For example:
    from p in context.Yard_Projects
    where p_projectIDs.Contains(p.ProjectID)
    select p
2. It does not generate methods for stored procedures that do not return an entity type. (We need to use Entity Client to execute such stored procedures.) For example:
    Void DeleteProjects(@CSVProjectIDs)
3. We need to map every column of a table in the storage schema. If some columns are not mapped, compilation fails with an error.
4. If we delete a type from the diagram, it is difficult to put it back. We may have to manually update the .edmx file, or delete the old model and generate a new one.
5. You don’t get much control over the storage schema. What you mostly see is the client schema and its mapping to the storage schema.
6. If you want to create methods that map to stored procedures returning results from multiple tables, you need to manually modify the SSDL section of the .edmx file.

Performance Test

I tested LINQ to SQL and Entity Framework and found that the overall performance of Entity Framework is better than that of LINQ to SQL. I separated the test cases into 2 test groups.
1. Test select/create/update/delete for a single table. You can see my test results in the “TestForSingleTable” sheet.
2. Test select/create/update/delete for a single table with many associations. You can see my test results in the “TestForManyAssociation” sheet.
You can see the performance report at the link, PerformanceReport

Recommendation

I think we should use Entity Framework rather than LINQ to SQL, for the following reasons.
1. It is more flexible in mapping the entity model to the database. You can map one class to multiple tables and use inheritance.
2. It supports many ways of querying: you can use LINQ to Entities, Entity SQL, Object Services, and Entity Client.
3. When the database schema changes, you can synchronize the entity model from the latest database.
4. It has better overall performance compared with LINQ to SQL.
5. Many of its features will be improved in VS2010. You will be able to create complex data types and generate a database from the entity model.
6. It supports many database vendors other than SQL Server.

The disadvantages of using Entity Framework may be its complexity and development time.

Based on my performance report (testperformancelinqtosql_entityframework), if we use Entity Framework to implement the data layer, we should use the following patterns.
1. Use LINQ to Entities for simple queries.
2. Use Entity Client or Object Services to execute stored procedures for complex queries (many joins, groups) or for queries that delete many entities with many associations.
3. Use Object Services to insert, update, or delete single or multiple entities.