Fields in Core code sprint: what we did

Jan 11, 2009

A few reports have already been written about the "Fields in Core" coding sprint that was held December 15th - 19th at the Acquia offices in Andover, Massachusetts. Since some of them are quite technical, I decided instead to give you a good idea of how we spent our time. For the more technical reports, check the "field in core" group.

Day 1: Design and architectural decisions

It was Monday morning, and as we were entering the office complex, chx pointed out that the two offices occupied by Acquia corresponded to HTTP status codes. That's when I realized how much fun this week was going to be.

After a quick tour of the Acquia offices and a round of introductions with some of the staff, we gathered in the conference room and got to work. The team consisted of Barry Jaspan, David Strauss, Dries Buytaert, Karen Stevenson, Karoly Negyesi, Moshe Weitzman, Yves Chedemois and me.

We started by setting goals for the week, then broke them down into precise tasks and prioritized them. The inclusion of CCK into core, and the many doors it opens, makes it easy for anyone to go off on tangents; we could easily have planned features that would have kept all of us busy for many months, but Barry did a great job of keeping us on track, and we made steady progress.

Once the scope of our work was defined, we started tackling the design of the new "fields API". In essence it provides the same functionality as CCK, but with a more generic approach that doesn't depend on nodes, allowing us to attach fields to other entities such as users or remote data. For this to be possible, we had to clearly distinguish the levels of abstraction involved and find terminology that would make the new concepts easy to understand.

Among other architectural decisions, such as the use of defined classes for field structures, the most important one was probably related to data storage. The discussion started in the late morning, carried on through lunch and continued into the afternoon, with many points made for and against the various proposed models. In the end we chose per-field storage, which at first is a step backwards in terms of performance, but which also greatly simplifies the implementation and makes it much easier to denormalize data for more efficient solutions.
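To make the trade-off a little more concrete, here is a minimal sketch of what per-field storage can look like. It is written in Python with SQLite purely for illustration and is not the code or schema from the sprint: the table naming scheme, the column names and the three helper functions are all assumptions of mine. Each field gets its own table keyed by entity type, entity id and a delta for multiple values, so loading an entity costs one query per field (the performance step backwards), while the storage code stays the same for every field and every kind of entity.

```python
import sqlite3


def _table(field_name):
    # Hypothetical naming scheme; validate the name because it is
    # interpolated into SQL as an identifier.
    if not field_name.isidentifier():
        raise ValueError(f"invalid field name: {field_name!r}")
    return f'"field_data_{field_name}"'


def create_field(db, field_name):
    # One table per field, independent of the kind of entity it is attached to.
    db.execute(
        f"CREATE TABLE {_table(field_name)} ("
        "entity_type TEXT NOT NULL, "
        "entity_id INTEGER NOT NULL, "
        "delta INTEGER NOT NULL, "  # position of the value for multi-value fields
        "value TEXT, "
        "PRIMARY KEY (entity_type, entity_id, delta))"
    )


def save_values(db, field_name, entity_type, entity_id, values):
    # Replace all stored values of this field for one entity.
    table = _table(field_name)
    db.execute(
        f"DELETE FROM {table} WHERE entity_type = ? AND entity_id = ?",
        (entity_type, entity_id),
    )
    db.executemany(
        f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
        [(entity_type, entity_id, delta, v) for delta, v in enumerate(values)],
    )


def load_entity(db, field_names, entity_type, entity_id):
    # One query per field: the price paid for per-field storage.
    loaded = {}
    for name in field_names:
        rows = db.execute(
            f"SELECT value FROM {_table(name)} "
            "WHERE entity_type = ? AND entity_id = ? ORDER BY delta",
            (entity_type, entity_id),
        ).fetchall()
        loaded[name] = [row[0] for row in rows]
    return loaded


db = sqlite3.connect(":memory:")
create_field(db, "nickname")
create_field(db, "tags")
save_values(db, "nickname", "user", 1, ["chx"])  # a field on a user
save_values(db, "tags", "node", 1, ["drupal", "cck"])  # the same API on a node
user_fields = load_entity(db, ["nickname", "tags"], "user", 1)
node_fields = load_entity(db, ["nickname", "tags"], "node", 1)
```

Because every field lives in a table of the same shape, a denormalized copy for fast reads can be built on top of this without changing how values are written.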

The end of the day was spent in separate groups. Barry, Karen and Yves worked on defining the data structures involved in the fields API. Moshe, chx and I worked on building some helper functions for the tests as well as outlining the tests themselves. David worked on his proposed denormalization API, which we now refer to as "materialized views" (which is the correct technical term, but is not related to the Views module).

Day 2: Breaking things

After a full day of theoretical discussion, everyone was happy to get their hands dirty and dig into the code. Barry had already done a good amount of work separating fields from field instances, and Yves had updated most of the code to Drupal 7 standards, but there was still a lot to do. Important structural changes, such as splitting CCK into a core "field.module" and a contrib "cck.module" and updating the code to the improved naming conventions, caused us to break a lot of code. As we started implementing the new concepts, more issues were raised and addressed.

Day 3: Building things

With the main structural changes in place, we started by defining the Field API more clearly, which also allowed us to write automated tests. We started with the internals and moved up through the abstraction layers, always getting a satisfactory level of code coverage from our tests before moving on to a different task. Writing tests in parallel with the code gave us instant code review and let us fix a lot of bugs on the spot, which made teamwork a lot more efficient.

During the whole week, we tried to keep the community up to date with our progress and key decisions. Some decisions, such as the choice to go with per-field storage, got a lot of heat from important contributors. However, addressing these important concerns was also our duty, and a few of us had to spend time defending concepts in IRC rather than writing code.

Day 4: Consolidating

The work from the previous day continued, and the code became more stable as the number of passing tests grew. David Rothstein joined us for the day and contributed some key pieces of the deletion of fields and instances.

We first got the API in place to attach field values to any generic entity programmatically, which was important in getting tests that covered all the requirements, but the most exciting moment of the day was probably when we all gathered behind Karen and Yves' laptops to stare at a textfield displayed on a user edit page!

Day 5: Escaping the storm

The weather forecast announced a snowstorm for Friday afternoon. While it wouldn't have kept anyone from coming to the office, it certainly would have kept us from leaving and catching our respective flights that evening or the next morning, so we decided to work remotely, with some of us meeting in Boston.


In conclusion, I think that this code sprint was a big success, both in terms of the results we achieved and in terms of our experience as a diverse team of developers coming together to work on a major contribution to the future of Drupal. This success has motivated the organization of more code sprints in the near future.

For more information, come to one of the "Fields in Core" sessions at DrupalCamp San Diego or DrupalCamp Germany. In addition to the informative session at DrupalCon DC, there will also be a more advanced session with the opportunity to discuss important decisions with other members of the code sprint team.

Note: this report will also appear on the blog of my employer, Achieve Internet, who sponsored my time during the code sprint.