Thursday, December 31, 2009

Continuous Integration of Python Code with Unit Tests and Maven

My main development language is Java, but I also do some work in Python for deployment and related tools. Being a big fan of unit testing, I write unit tests in Python using PyUnit. Being a big fan of Maven and Continuous Integration, I really want the Python unit tests to run as part of the build. I wanted a solution that met the following criteria:
  • Uses commonly available plugins.
  • Keeps the Maven structure, with test and source files in the appropriate directories.
  • Runs the tests in the test phase and fails the build when the tests fail.

The simplest approach I came up with was to use the Exec Maven Plugin, adding the following configuration to your (Python) project's POM:

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>python-test</id>
      <!-- Run the Python tests during Maven's test phase -->
      <phase>test</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>python</executable>
        <workingDirectory>src/test/python</workingDirectory>
        <arguments>
          <argument>unitTests.py</argument>
        </arguments>
        <environmentVariables>
          <!-- Path is relative to workingDirectory, so tests can import the modules in src/main/python.
               Shell-style $PYTHONPATH is not expanded here; ${env.PYTHONPATH} is interpolated by Maven. -->
          <PYTHONPATH>../../main/python:${env.PYTHONPATH}</PYTHONPATH>
        </environmentVariables>
      </configuration>
    </execution>
  </executions>
</plugin>

This works well enough. Setting the PYTHONPATH environment variable allows your PyUnit tests to find the modules you are building in the project. What's less than ideal is that, unlike with other Maven plugins, the person running the build needs to have Python installed and configured correctly. (You can allow for some variations between environments. And if you have a developer on your project who doesn't use Python, and doesn't want to, there is a property you can set on the exec plugin (its skip parameter) to skip the tests, so in the end only those who use Python, and the continuous integration server, need the correct things installed.)

This may be obvious to some, if not many, but in case anyone is looking for an answer to how to run Python unit tests as part of a Maven build, I hope that this is helpful.

Thursday, December 17, 2009

Versions in Maven and Source

I'm a big fan of Maven, a build (and project) management tool. When you are working with Maven, each artifact that you develop (a jar or war file, for example) has a version that's distinct from the version in your SCM system. The Maven Book has a good discussion about how versions are managed, but there are often questions on projects about how to use Maven versions when you also have the SCM version. "SCM Version 6453" gives more information than "Artifact version 1.1", for example, yet we have 2 version numbers to manage. Here's one approach that works well for simple projects.

I'm assuming that you know the basics of how to specify dependencies in Maven. If you need a quick intro to Maven, see the Maven Book.

If you are developing a project that has artifacts that external clients will incorporate with Maven, you need to change the artifact version with each release, as you specify the artifact version in the dependency element in your POM. If you are just using Maven to develop components that will never show up in an external repository, adding the artifact version dimension to the SCM version can appear to add little value at the expense of some overhead, unless you have a clear model of how the two interact.

Many teams use Maven to manage external dependencies without publishing their own artifacts to external Maven repositories. A simple webapp, for example, may have one war file and some external dependencies. For support and validation purposes, you can identify such a build by its build number or SCM revision number. There are a couple of ways to figure out the source version that you built your webapp from.

If you use a continuous integration tool like Bamboo, you can pass in the build number on the build command line by adding -DbuildNumber=${bamboo.buildNumber} to make the property buildNumber available to your project. You can make this available to the application, so that you can see the build number on a login screen or about page. From the build number you can infer the SCM version. Or you can often ask the CI system for the SCM version number. For example, in Bamboo, you can specify -DscmRevision=${custom.svn.revision.number} if you are using Subversion as your SCM.

You can also use the Build Number Maven Plugin (buildnumber-maven-plugin) to embed the version in your war manifest or make it available to your Flex application, for example.

Regardless of how you get it, the revision number gives you enough information to identify what code you are working with. The Maven convention to keep in mind is that SNAPSHOT artifacts are for active development, and non-SNAPSHOT versions are for released artifacts, so while you are working on a new project, you'd be building a 1.0-SNAPSHOT artifact, and when you are ready to create a release branch, you would change it to 1.0.

So when should you change the project version in Maven? Here is a simple approach: One Maven major/minor version per codeline. Incremental versions stay on the same codeline.

So if you are using an active development line pattern (see Software Configuration Management Patterns) this means that each Release Line has a Maven version, and your Mainline is used to create the SNAPSHOT version of the next release artifact.

For example:
  • Start mainline development with 1.0-SNAPSHOT version
  • When you have a release:
    • create the release branch and change the artifact to 1.0.
    • Change the mainline version to 1.1-SNAPSHOT (or whatever the next active version is)
To support the 1.0 release you have 3 choices:
  1. Just work with the 1.0 artifact version and use build numbers to distinguish between the current release and the support release.
  2. Branch off of the 1.0 codeline, change the version to 1.0.1-SNAPSHOT, and merge back when the 1.0.1 release goes out.
  3. Change the version on the support branch to 1.0.1-SNAPSHOT and don't branch. Change the version to 1.0.1 when you are done, and create a tag for the release.
I prefer the third option as the simplest and best, assuming that there is only one stream of development work for the maintenance branch. It balances the need for identification with the desire to minimize overhead.
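The version transitions in this scheme are mechanical enough to sketch in a few lines of Python. This is only an illustration, not a Maven API: the function names are hypothetical, and it assumes plain numeric M.N or M.N.n versions and that the next mainline version is always the next minor version, which the text above leaves open.

```python
SNAPSHOT = "-SNAPSHOT"


def release_version(version):
    """1.0-SNAPSHOT -> 1.0: the version set on a newly created release branch."""
    if not version.endswith(SNAPSHOT):
        raise ValueError("expected a SNAPSHOT version, got %r" % version)
    return version[: -len(SNAPSHOT)]


def next_mainline_snapshot(version):
    """1.0-SNAPSHOT -> 1.1-SNAPSHOT: mainline moves on (assumed: next minor version)."""
    major, minor = release_version(version).split(".")[:2]
    return "%s.%d%s" % (major, int(minor) + 1, SNAPSHOT)


def next_maintenance_snapshot(released):
    """1.0 -> 1.0.1-SNAPSHOT, 1.0.1 -> 1.0.2-SNAPSHOT: stays on the same codeline."""
    if released.endswith(SNAPSHOT):
        raise ValueError("expected a released version, got %r" % released)
    parts = released.split(".")
    if len(parts) == 2:
        # No incremental part yet, so treat M.N as M.N.0.
        parts.append("0")
    parts[2] = str(int(parts[2]) + 1)
    return ".".join(parts[:3]) + SNAPSHOT
```

For example, release_version("1.0-SNAPSHOT") gives "1.0", next_mainline_snapshot("1.0-SNAPSHOT") gives "1.1-SNAPSHOT", and next_maintenance_snapshot("1.0") gives "1.0.1-SNAPSHOT".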


To summarize:
  • New development for M.N-SNAPSHOT versions on trunk. Use SCM version and build numbers to identify components internally.
  • M.N versions on their own branches.
  • Use M.N.n versions for maintenance releases: SNAPSHOT versions during development, with the SNAPSHOT suffix removed when done.
I'm interested in hearing what others may have done to reconcile SCM and Maven artifact versions for internal components.

Sunday, December 13, 2009

Uncertainty and Agile Requirements

The key value of Agile methods is to help you to manage uncertainty. By being incremental and iterative, you manage risks by not investing a lot of effort in specifying things that may be "wrong." At the start of each iteration you can look at what you have and decide that it's the right thing, in which case you can build on it, or the wrong thing, in which case you can try something else. Since you've only invested a small amount of effort relative to the specification you do in a waterfall process, you've wasted less effort and, in the end, money if you are wrong.

This approach of small stories with only some details works really well in many cases. An agile team runs into trouble when the project team confuses "uncertainty" with "vagueness." To be successful, an agile team needs to work off of a backlog that has stories that are precise enough that the team can iterate effectively with the stakeholders at the end of each iteration, and can develop with a high velocity. It's important to add precision even if you have uncertainties. While it's important to be as accurate as possible, don't use your lack of certainty about a requirement as an excuse to accept a lack of precision. When you have a good target to aim for (and you hit it) you can iterate quickly and judge if you are hitting the right targets.

How do you tell that you have enough precision? This varies from team to team. For a team that has been together for a time and has a clear shared vision, a very brief statement of goals might well be enough. For a project where the vision is less clear, a longer conversation may be necessary. Three concrete tests are:
  • Can the team estimate a story? (See Agile Estimating and Planning for more about estimating.) If the answer is "there is not enough information to estimate" then the story is too vague, and the team and the product owner need to meet to make sure that they understand the options. If the team estimates a story that the Product Owner thought was simple at 3 weeks, you have raised a flag that you need more conversation to understand what the PO really wants.
  • Can you provide three options for how to implement the story, or three variations of what the user experience will be? If you find yourself developing many more that seem plausible, the story is too vague, and if you can only develop one or two, then there is not enough information for you to think through the story.
  • Can you test the story? If you can't come up with a reasonable high-level test plan, then the story is too vague. (Mike Cohn has written an excellent article about the value of planning with the larger team.)
While being able to do all three for a story is nice, being able to estimate with confidence is the one test you should insist on before considering a story well developed. If you can't estimate based on what you know about the story, the good news is that the very act of trying to come up with an estimate, options, or a test plan will help you refine the story.

One might say that this is too much planning for an agile process, and that this level of detail sounds kind of like a "waterfall."  And at a high level it seems related to the Cone Of Uncertainty, which is a model for waterfall development. The difference is that we still don't need or want to have fully defined specifications at the start of the project; as we approach a development iteration, we want enough detail to have a development target that a stakeholder can evaluate.

At the end of an iteration when something isn't quite right, you want your stakeholders to say "that's not what I want" rather than argue over what they meant. The latter will still happen, and it's OK when it does. By being precise about what you think you want to build, you will identify the high risk areas of a project early on, so that you can take full advantage of the risk management benefits of agile.
