In git we trust
It was one of that discussions when people prefer leaving the chat room before they say something they would regret later. The Pandora’s box was opened by an innocent question on the recommended procedure to customize a contributed Moodle plugin. The discussion turned into a hot topic on what the current source code management system (CVS) gives to us and what we and our contributors could gain (and lose) from switching the Moodle CONTRIB repository to git.
Imagine you have clients who pay you for implementing a solution based on a contributed Moodle plugin. The quality of the Moodle contributed code varies and some sort of the code review and clean-up is strongly recommended before using it at production server. The contributed code may contain critical security holes and such plugin may easily become an open gate to your server with a big bright sign “Welcome Attackers!” So you have reviewed the code, fixed several obvious bugs and improved the features a bit as requested by your client. And – now what?
Of course you want to deploy your changes to the client’s server – as soon as it is tested (or at least it does not produce any PHP warnings
) And you, as a follower of the open source spirit, want to contribute your changes back to upstream. So you send your patches to the original component’s maintainer to be included in CVS. Sounds good. In a perfect world it would work with no problems. But the humans are not dead yet and things tend to get complicated. Also, some objections against the real benefits of this procedure were raised. Let me list some issues that may influence this contributing back to upstream.
- The maintainer does not respond. Such things happen. You can ask the CVS administrator to give you write access and become a maintainer. Do not do this if you know/feel you would not have time to do it responsibly. Remember the lesson #5 from the Eric Raymond’s famous essay: “When you lose interest in a program, your last duty to it is to hand it off to a competent successor.” It’s better if it’s known there is no maintainer than if people think they can rely on you. On the other hand, do it if a silent voice inside your mind whispers that it’s the job you always wanted to do. It’s both nice and good thing to spend your life on.
- The maintainer does not agree with your improvements. Then it’s time to discuss things. Summarize the issue and open the topic in Moodle forums. Listen to the maintainer’s arguments. Maybe he/she has some further plans with the module? Who else can profit from your changes? Try to not to fork the product unless there is a chance to reach a consensus.
- The rubbish can get into the code after you reviewed it. Yes, it can. You must remember the state of the code when it was reviewed. Source code management software calls such signs as “tags”. You shall tag a certain point of the source code history and review exactly that point. Then you know what version is sort of safe to be deployed. Also, you can focus on the changes that came after this point later when you review the code again.
- The maintainer does not understand your fixes. Some say that simply sending the patches to the maintainers is useless as they do not spend any effort on learning from it. I personally do not agree with this. However, I can see the importance of educating the contributors. If the contributed code contains a security bug which is exploited, who is blamed? The administrators? No, they have always been against this community thingie. The original author? No, nobody knows him/her. Moodle itself? Sure – it’s all Moodle’s fault! And that’s not fair.
All these have nothing to do with the SCM software itself. It is about responsibility, communication, constructive peer-reviews and developer etiquette. However, I think that the features and the characteristics of the SCM may significantly help with sorting out the issues discussed above or with not running into them at all.
Switching Moodle CONTRIB area into a cluster of separate git repositories emerged as an idea during the discussion. The nature of git – a distributed versioning system – seems to perfectly fit into the nature of our contrib, the hive of really interesting features and ideas. Contributors may come with their own trees of source code and the comparison of such two trees is easy. If the maintainer stops responding to the community questions and proposals, his/her tree becomes quickly obsolete and the community may start to prefer some other tree (if such is available). The pull-based model of the the maintainer-contributor interaction requires the maintainer pays attention to the patches to be included and hopefully learns from them. On the other hand, the contributors have to keep their trees up-to-date with the recent versions of the upstream code if they wish it is included in the maintainer’s tree. The point is, with git, everybody can easily become a maintainer of his/her own clone of the product. They have to gain the respect of the community members by keeping the high quality of their version. And remember – respect is everything.
I do not think we should switch to git-only at the moment. Let us give git a chance and see what Moodle contributors think. To help them with their first steps, I summarized my personal experiences with using git in Moodle development into a sort of tutorial at http://docs.moodle.org. You are welcome to review it and/or put comments with your own solutions.
October 26th, 2009 at 23:00
Hi,
hmmm, you are forgetting one important purpose of contrib cvs. When you start refactoring Moodle core or start developing new feature it is very handy to just grep contrib cvs checkout and immediately you see what you are about to break. You can also contact contrib devs and tell them what needs to be fixed. My favourite mantra is – “if it is not in contrib it does not exist”
This also applies to searching for potential security problems.
Now tell me how am I supposed to do this if the contrib code is spread over hundreds of branches in various git repos? Does git support this scenario?
November 13th, 2009 at 23:56
git itself can be used for such centralized repositories, of course (it is not question of SCM technology, it is about software maintenance processes). But I admit it might be more difficult to easily get a snapshot of the whole contrib as I suppose it would be just a heap of separate git repositories.