DataworksSummit Berlin - Wednesday morning

2018-04-19 06:50
Data strategy - cloud strategy - business strategy: Aligning the three was one of the main themes (initially put forward by Hortonworks CTO Scott Gnau in his opening keynote) throughout this week's Dataworks Summit Berlin, kindly organised and hosted by Hortonworks. The event drew over 1000 attendees from 51 countries.

The inspiration that Scott put forward in the first keynote was to take a closer look at the data lifecycle - including the fact that a lot of data is being created (and made available) outside the control of those using it: Smart farming users combine weather data with information on soil conditions gathered through sensors out in the field in order to inform daily decisions. Manufacturing is moving towards closer monitoring of production lines to spot inefficiencies. Cities are starting to deploy systems that allow for better integration of public services. UX is being optimized through extensive automation.

When it comes to moving data to the cloud, the speaker gave a nice comparison: To him, the difficulties that moving to the cloud brings are similar to the challenges of moving "stuff" to external storage in the garage: It opens questions of "Where did I put this thing?", but also about access control and security. In much the same way, cloud and on-prem integration means that questions like encryption, authorization, user tracking and data governance need to be answered, as do questions of findability, discoverability and integration for analysis purposes.

The second keynote was given by Mandy Chessell from IBM introducing Apache Atlas for metadata integration and governance.

In the third keynote, Bernard Marr talked about the five promises of big data:

  • Informing decisions based on data: The goal here should be to move towards self service platforms to remove the "we need a data scientist for that" bottleneck. That in turn needs quite some training and hand-holding for those interested in the self-service platforms.
  • Understanding customers and customer trends better: The example given was a butcher who installed a mobile phone tracker in his shop window in order to see which advertisement would make more people stop by and look closer. As a side effect he noticed an increase in people on the street in the middle of the night (coming from pubs nearby). A decision was made to open at that time and offer what people were searching for at that time according to Google trends - by now that one hour in the night makes up a sizeable portion of the shop's income. The second example given was Disney already watching all its Disney park visitors through wrist bands, automating line management at popular attractions - but also deploying facial recognition to watch audiences watching shows in order to figure out how well those shows are received.
  • Improve the customer value proposition: The example given was the Royal Bank of Scotland moving closer to its clients, informing them through automated means when interest rates are dropping, or when they are double insured - thus building trust and transparency. The other example given was that of a lift company building sensors into lifts in order to be able to predict failures and repair lifts when they are least used.
  • Automate business processes: Here the example was that of a car insurer that would offer dynamic rates if people let themselves be monitored while driving. Those adhering to speed limits and avoiding risky routes and times would get lower rates. Another example was that of automating the creation of sports reports, e.g. for tennis matches based on sensors deployed, or that of automating Forbes analyst reports, some of which get published without the involvement of a journalist.
  • Last but not least the speaker mentioned the obvious business case of selling data assets - e.g. selling aggregated and refined data gathered through sensors in the field back to farmers. Another example was the automatic detection of events based on sounds detected - e.g. gun shots close to public squares and selling that back to the police.

After the keynotes were over breakout sessions started - including my talk about the Apache Way. It was good to see people show up to learn how all the open source big data projects are working behind the scenes - and how they themselves can get involved in contributing and shaping these projects. I'm looking forward to receiving pictures of feather shaped cookies.

During lunch there was time to listen in on how Santander operations is using data analytics to drive incident detection, as well as load prediction for capacity planning.

After lunch I had time for two more talks: The first explained how to integrate Apache MXNet with Apache NiFi to bring machine learning to the edge. The second one introduced Apache Beam - an abstraction layer above Apache Flink, Spark and Google's platform.

Both scary and funny: Walking up to the Apache Beam speaker after his talk (having learnt at DataworksSummit that he is PMC Chair of Apache Beam) - only to be greeted with "I know who *you* are" before even getting to introduce oneself...

Apache Breakfast

2018-04-17 07:39

In case you missed it but are living in Berlin - or are visiting Berlin/ Germany this week: A handful of Apache people (committers/ members) are meeting over breakfast on Friday morning this week. If you are interested in joining, please let me know (or check for yourself in the archives of the mailing list).

FOSS Backstage - Schedule online

2018-04-17 07:27
In January the CfP for FOSS Backstage opened. By now reviews have been done, speakers notified and a schedule created.

I'm delighted to find among the speakers both a lot of friends from the Apache Software Foundation and a great many people who aren't affiliated with the ASF.

If you want to know how Open Source really works, if you want to get a glimpse behind the stage, do not wait too long to grab your ticket and join us in summer in Berlin/ Germany.

If project management is only partly of interest to you, we have you covered as well: For those interested in storing, searching and scaling data analysis, Berlin Buzzwords is scheduled to take place in the same week. For those interested in Tomcat, httpd, cloud and IoT, Apache Roadshow is scheduled to happen on the same days as FOSS Backstage - and your FOSS Backstage ticket grants you access to Apache Roadshow as well.

If you're still not convinced - head over to the conference website and check out the talks available yourself.

My board nomination statement 2018

2018-03-23 07:21
Two days ago the Apache Software Foundation members meeting started. One of the outcomes of each members meeting is an elected board of directors. The way that works is explained here: Annual Apache members meeting. As explained in the linked post, members accepting their nomination to become a director are supposed to provide a nomination statement. This year they were also asked to answer a set of questions so members could better decide who to vote for.

Since one of my pet causes is making the inner workings of the foundation more transparent to outsiders (and I have said so in the nomination statement), I would like to start by publishing my own nomination statement here for others to read who don't have access to our internal communication channels:

Board statement:

Two years ago I was put on a roller coaster by being nominated as Apache board member which subsequently meant I got to serve on the board in 2016. Little did I know what kind of questions were waiting for me.

Much like back then I won't treat this position statement as a voting campaign. I don't claim to have answers to all the questions we face as we grow larger - however I believe being a board member even at our size should be something that is fun. Something that is lightweight enough so people don't outright decline their nominations just for lack of time.

One thing I learnt the hard way is scalability needs two major ingredients: Breaking dependencies and distribution of workload. Call me old-fashioned (even though chemistry can hide my gray hair, my preference for mutt as a mail client betrays my age), but I believe we already have some of the core values to achieve just that:
  • "Community over code" to me includes rewarding contributions that aren't code. I believe it is important to get people into the foundation who are committed to both our projects and the foundation itself - helping us in all sorts of ways, including but not limited to coding, documenting, marketing, mentoring, legal, education and more.
  • "What didn't happen on the mailing list didn't happen" to me means communicating as publicly as possible (while keeping privacy as needed) to enable others to better understand where we are, how we work, what we value and ultimately how to help us. I would like for us to think twice before sending information to private lists - both at the project and at the operational level.
  • I believe we can do better in getting those into the loop who have a vested interest in seeing that our projects are run in a vendor neutral way: Our downstream users who rely on Apache projects for their daily work.
I am married to a Linux kernel geek working for the Amazon kernel and operating systems team - I learnt a long time ago that the Open Source world is bigger than just one project, bigger than just one foundation. Expect me to keep the bigger picture in mind during my work here, which is not ASF exclusive.

Much like Bertrand I'm a European - that means I do see value in time spent offline, in being disconnected. I would like to urge others to take that liberty as well - if not for yourselves, then at least to highlight where we are still lacking in terms of number of people that can take care of a vital role.

As you may have guessed from the time it took for me to accept this nomination, I didn't take the decision lightly. For starters semi-regularly following the discussion on board@ to me feels like there are people way more capable than myself. Seeing just how active people are feels like my time budget is way too limited.

So what made me accept? I consider myself lucky seeing people nominated for the Apache board who are capable leaders that bring very diverse skills, capabilities and knowledge with them that taken together will make an awesome board of directors.

I know that with FOSS Backstage one other "pet project of mine" is in capable hands, so I don't need to be involved in it on a day-to-day basis.

Last but not least I haven't forgotten that back in autumn 2016 Lars Trieloff* told me that I am a role model: Being an ASF director, while still working in tech, with a now three-year-old at home. As the saying goes "Wege entstehen dadurch, dass man sie geht" - free-form translation: "paths are created by walking them." So instead of pre-emptively declining my nomination I would like to find a way to make the role of being a Director at the Apache Software Foundation something that is manageable for a volunteer. Maybe along that way we'll find a piece in the puzzle to the question of who watches the watchmen - how do we reduce the number of volunteers that we burn through, operating at a sustainable level, enabling people outside of the board of directors to take over or help with tasks.

* Whom I know through the Apache Dinner/ Lunch Berlin that I used to organise what feels like ages ago. We should totally re-instate that again now that there are so many ASF affiliated people in or close to Berlin. Any volunteers? The one who organises gets to choose date and location after all ;)

Answers to questions to the board nominees:

On Thu, Mar 15, 2018 at 01:57:07PM +0100, Daniel Gruno wrote:
> Missions, Visions...and Decisions:
> - The ASF exists with a primary goal of "providing open source
> software to the public, at no charge". What do you consider to be
> the foundation's most important secondary (implicit) goal?

I learnt a lot about what is valuable to us in the following discussion:

(and the following public thread over on dev@community with the same subject). My main take-away from there came from Bertrand: The value we are giving back to projects is by providing "A neutral space where they can operate according to our well established best practices."

The second learning I had just recently when I had the chance of thinking through some of the values that are encoded in our Bylaws that you do not find in those of other organisations: At the ASF you pay for influence with time (someone I respect a lot extended that by stating that you actually pay with time and love).

> - Looking ahead, 5 years, 10 years...what do you hope the biggest
> change (that you can conceivably contribute to) to the foundation
> will be, if any? What are your greatest concerns?

One year ago I had no idea that little over two months from now we would have something like FOSS Backstage here in Berlin: One thing the ASF has taught me is that predicting the future is futile - the community as a whole will make changes in this world that are way bigger than the individual contributions taken together.

> - Which aspect(s) (if any) of the way the ASF operates today are you
> least satisfied with? What would you do to change it?

Those are in my position statement already.

> #######################################

> Budget and Operations:
> - Which roles do you envision moving towards paid roles. Is this the
> right move, and if not, what can we do to prevent/delay this?

Honestly I cannot judge what's right and wrong here. I do know that burning through volunteers to me is not an option. What I would like to hear from you as a member is what you would need to step up and do operational tasks at the ASF.

Some random thoughts:
- Do we have the right people in our membership that can fill these operational roles? Are we doing a good enough job in bringing people in with all sorts of backgrounds, who have done all sorts of types of contributions?
- Are we doing a good enough job at making transparent where the foundation needs operational help? Are those roles small enough to be filled by one individual?

This question could be read as if work at the ASF today were not paid for. This is far from true - both at the project as well as at the operational level. What I think we need is collective understanding of what the implications of various funding models are: That the ASF doesn't accept payment for development doesn't directly imply that projects are more independent as a result. I would assume the same to be true at the operational level.

> #######################################
> Membership and Governance:
> - Should the membership play a more prominent role in
> decision-making at the ASF? If so, where do you propose this be?

I may be naive but I still believe in "those who do the work are those who take decisions". There were only close to a dozen people who participated in the "ask the members questionnaire" I sent around - something that was troubling for me to see was how pretty much everyone wanted

> - What would be your take on the cohesion of the ASF, the PMCs, the
> membership and the communities. Are we one big happy family, or
> just a bunch of silos? Where do you see it heading, and where do
> we need to take action, if anywhere?

If "one big happy family" conjures the picture of people with smiling faces only, then that is a very cheesy image of a family that in my experience doesn't reflect the reality of what families typically look like.

This year at FOSDEM in Brussels we had a dinner table of maybe 15 people (while I did book the table, I don't remember the exact number - over-provisioning and a bit of improvisation helped a lot in making things scale) from various projects, who joined at various times. I do remember a lot of laughter at that table. If anything I think we need to help people bump into each other face to face, independently of their respective project community, more often.

> - If you were in charge of overall community development (sorry,
> Sharan!), what would you focus on as your primary and secondary
> goal? How would you implement what you think is needed to achieve
> this?

I'm not in charge of that - nor would I want to be, nor should I be. The value I see in the ASF is that we rely very heavily on self organisation, so this foundation is what each individual in it makes out of it - and to me those individuals aren't limited to foundation members, PMC members or even committers. In each Apache Way talk I've seen (and every time I explain the Apache Way to people) the explanation starts with our projects' downstream users.

> Show and Tell:

I'm not much of a show and tell person. At ApacheCon Oakland I once was seeking help with getting a press article about ApacheCon reviewed. It was easy finding a volunteer to proof-read the article. The reason for that ease given by the volunteer themselves? What they got out of their contributions to the ASF was much bigger than anything they put into it. That observation holds true for me as well - and I do hope that this is true for everyone here who is even mildly active.

An argument against proxies

2018-03-08 17:53
Proxies? In companies getting started with an upstream first concept this is what people are called who act as the only interface between their employer and an open source project: All information from any project used internally flows through them. All bug reports and patches intended as upstream contributions also flow through them - hiding entire teams producing the actual contributions.

At Apache projects I learnt to dislike this setup of having proxies act in place of the real contributors. Why so?

Apache is built on the premise of individuals working together in the best interest of their projects. Over time, people who prove their commitment to a project get added to that project. Work contributed to a project gets rewarded. Working on an Apache project is a role independent of other work commitments: in the "merit doesn't go away" sense this merit is attached to the individual making contributions, not to the entity sponsoring that individual in one way or another.

This mechanism does not work anymore if proxy committers act as gateway between employers and the open source world: While proxied employees are saved from the tax that working in the public brings by being hidden behind proxies, they will also never be able to accrue the same amount of merit with the project itself. They will not be rewarded by the project for their commitment. Their contributions do not end up being attached to themselves as individuals.

From the perspective of those watching how much people contribute to open source projects the concept of proxy committers often is neither transparent nor clear. For them proxies establish a false sense of hyper productivity: The work done by many sails under the flag of one individual, potentially discouraging others with less time from participating: "I will never be able to devote that much work to that project, so why even start?"

From an employer point of view proxies turn into single point of failure roles: Once that person is gone (on vacation, to take care of a relative, found a new job) they take the bonds they made in the open source project with them - including any street cred they may have gathered.

Last but not least I believe in order to discuss a specific open source contribution the participants need a solid understanding of the project itself. Something only people in the trenches can acquire.

As a result you'll see me try and pull those actually working with a certain project to get active and involved themselves, to dedicate time to the core technology they rely on on a daily basis, to realise that working on these projects gives you a broader perspective beyond just your day job.

FOSDEM 2018 - recap

2018-02-13 06:13
Too crowded, too many queues, too little space - but also lots of friendly people, Belgian waffles, ice cream, an ASF dinner with grey beards and new people, a busy ASF booth, bumping into friends every few steps, meeting humans you see only online for an entire year or more: For me, that's the gist of this year's FOSDEM.

Note: German version of the article including images appeared in my employer's tech blog.

To my knowledge FOSDEM is the biggest gathering of free software people in Europe at least. It's free of charge, kindly hosted by ULB, organised by a large group of volunteers. Every year in early February the FOSS community meets for one weekend in Brussels to discuss all sorts of aspects of Free and Open Source Software Development - including community, legal, business and policy aspects. The event features more than 600 talks as well as several dozen booths by FOSS projects and FOSS friendly companies. There are several FOSDEM fringe events surrounding the event that are not located on campus. If you go to any random bar or restaurant in Brussels that weekend you are bound to bump into FOSDEM people.

Fortunately for those not lucky enough to have made it to the event, video recordings (unfortunately in varying quality) are available online. Some highlights you might want to watch:

One highlight for me personally this year: I cannot help but believe that I met way more faces from The Apache Software Foundation than at any other FOSDEM before. The booth was crowded at all times - Sharan Foga did a great job explaining The ASF to people. Also it's great to hear The ASF mentioned in several talks as one of the initiatives to look at to understand how to run open source projects in a sustainable fashion with an eye on longevity. It was helpful to have at least two current Apache board members (Bertrand Delacretaz as well as Rich Bowen) on site to help answer tricky questions. Last but not least it was lovely meeting several of the Apache Grey Beards (TM) for an Apache Dinner on Saturday evening. Luckily co-located with the FOSDEM HPC speaker dinner - which took a calendar conflict out of the Apache HPC people's calendar :)

Personally, I hope to see many more ASF people later this year in Berlin for FOSS Backstage - the advertisement sign that was located at the FOSDEM ASF booth last weekend already made it here, will you follow?

FOSS Backstage - CfP open

2018-01-23 16:21
It's almost ten years ago that I attended my first ApacheCon EU in Amsterdam. I wasn't entirely new to the topic of open source or free software. I attended several talks on Apache Lucene, Apache Solr, Hadoop, Tomcat and httpd. I still remember that the most impressive stories didn't necessarily come from the project members, but from downstream users. They were the ones authorized to talk publicly about what could be done with the project - and often became committers themselves down the road.

With "community over code" being one of the main values at Apache, ApacheCon also hosted several non-technical tracks: Open source and business, Open Development (nowadays better known as Inner Source), Open Source project management, project governance, an Apache Way talk. Over the past decade one learning survived any wave of tech buzzword: At the end of the day, success in Open Source (much like in any project) is defined by how well the project is run (read: managed). Reflecting on that the idea was born to create a space to discuss just these topics: What does it take to be "Leading the wave of open source"?

As announced on Berlin Buzzwords we (that is Isabel Drost-Fromm, Stefan Rudnitzki as well as the eventing team over at newthinking communications GmbH) are working on a new conference in summer in Berlin. The name of this new conference will be "FOSS Backstage". Backstage comprises all things FOSS governance, open collaboration and how to build and manage communities within the open source space.

Submission URL: Call for Presentations

The event will comprise presentations on all things FOSS governance, decentralised decision making, open collaboration. We invite you to submit talks on the topics: FOSS project governance, collaboration, community management. Asynchronous/ decentralised decision making. Vendor neutrality in FOSS, sustainable FOSS, cross team collaboration. Dealing with poisonous people. Project growth and hand-over. Trademarks. Strategic licensing. While it's primarily targeted at contributions from FOSS people, we would love to also learn more on how typical FOSS collaboration models work well within enterprises. Closely related topics not explicitly listed above are welcome.

Important Dates (all dates in GMT +2)

Submission deadline: February 18th, 2018.

Conference: June 13th/14th, 2018

High quality talks are called for, ranging from principles to practice. We are looking for real world case studies, background on the social architecture of specific projects and a deep dive into cross community collaboration. Acceptance notifications will be sent out soon after the submission deadline. Please include your name, bio and email, the title of the talk and a brief abstract in English.

We have drafted the submission form to allow for regular talks, each 45 min in length. However you are free to submit your own ideas on how to support the event: If you would like to take our attendees out to show them your favourite bar in Berlin, please submit this offer through the CfP form. If you are interested in sponsoring the event (e.g. we would be happy to provide videos after the event, free drinks for attendees as well as an after-show party), please contact us.

Schedule and further updates on the event will be published soon on the event web page.

Please re-distribute this CfP to people who might be interested.

Contact us at:
newthinking communications GmbH
Schoenhauser Allee 6/7
10119 Berlin, Germany

Looking forward to meeting you all in person in summer :)

Trust and confidence

2017-12-06 05:48
One of the main principles at Apache (as in The Apache Software Foundation) is "Community over Code" - having the goal to build projects that survive single community members losing interest or time to contribute.

In his book "Producing Open Source Software" Karl Fogel describes this model of development as Consensus-based Democracy (in contrast to benevolent dictatorship): "Consensus simply means an agreement that everyone is willing to live with. It is not an ambiguous state: a group has reached consensus on a given question when someone proposes that consensus has been reached and no one contradicts the assertion. The person proposing consensus should, of course, state specifically what the consensus is, and what actions would be taken in consequence of it, if those are not obvious."

What that means is that not only one person can take decisions but pretty much anyone can declare a final decision was made. It also means decisions can be stopped by individuals on the project.

This model of development works well if what you want for your project is resilience to people, in particular those high up in the ranks, leaving at the cost of nobody having complete control. It means you are moving slower, at the benefit of getting more people on board and carrying on with your mission after you leave.

There are a couple of implications to this goal: If for whatever reason one single entity needs to retain control over the project, you better not enter the incubator like suggested here. Balancing control and longevity is particularly tricky if you or your company believes they need to own the roadmap of the project. It's also tricky if your intuitive reaction to hiring a new engineer is to give them committership to the project on their first day - think again keeping in mind that Money can't buy love. If you're still convinced they should be made committer, Apache probably isn't the right place for your project.

Once you go through the process of giving up control with the help from your mentors you will learn to trust others - trust others to pick up tasks you leave open, trust others to take the right decision even if you would have done things otherwise, trust others to come up with solutions where you are lost. Essentially, like Sharan Foga said, to Trust the water.

Even coming to the project at a later stage as an individual contributor you'll go through the same learning experience: You'll learn to trust others with the patch you wrote. You'll have to learn to trust others to take your bug report seriously. If the project is well run, people will treat you as an equal peer, with respect and with appreciation. They'll likely treat you as part of the development team with as many decisions as possible - after all that's what these people want to recruit you for: For a position as volunteer in their project. Doing that means starting to Delegate like a Pro as Deb Nicholson once explained at ApacheCon. It also means training your capability for Empathy like Leslie Hawthorn explained at FOSDEM. It also means treating all contributions alike.

There's one prerequisite to all of this working out though: Working in the open (as in "will be crawled, indexed and made visible by the major search engine of the day"), giving control to others over your baby project and potentially over what earns your daily living means you need a lot of trust not only in others but also in yourself. If you're in a position where you're afraid that missteps will have negative repercussions on your daily life you won't become comfortable with all of that. For projects coming to the incubator as well as companies paying contributors to become open source developers in their projects in my personal view that's an important lesson: Unless committers already feel self confident and independent enough of your organisation as well as the team they are part of to take decisions on their own, you will run into trouble walking towards at least Apache.

Open Source Summit - Day 3

2017-10-29 08:35
Open source summit Wednesday started with a keynote by members of the Banks family telling a packed room on how they approached raising a tech family. The first hurdle that Keila (the teenage daughter of the family) talked about was something I personally had never actually thought about: Communication tools like Slack that are in widespread use come with an age restriction excluding minors. So by trying to communicate with open source projects means entering illegality.

A bit more obivious was their advise to help raise kids' engagement with tech: Try to find topics that they can relate to. What works fairly often are reverse engineering projects that explain how things actually work.

The Banks are working with a goal based model where children get ten goals to pursue during the year with regular quarterly reviews. An intersting twist though: Eight of these ten goals are choosen by the children themselves, two are reserved for parents to help with guidance. As obvious as this may seem, having clear goals and being able to influence them yourselves is something that I believe is applicable in the wider context of open source contributor and project mentoring as well as employee engagement.

The speakers also talked about embracing children's fear. Keila told the story of how she was afraid to talk in front of adult audiences - in particular at the keynote level. The advise that her father gave that did help her: You can trip on the stage, you can fall, all of that doesn't matter for as long as you can laugh at yourself. Also remember that every project is not the perfect project - there's always something you can improve - and that's ok. This is fairly in line with the feedback given a day earlier during the Linux Kernel Panel where people mentioned how today they would never accept the first patch they themselves had once written: Be persistant, learn from the feedback you get and seek feedback early.

Last but not least, the speakers advised to not compare your family to anyone, not even to yourself. Everyone arrives at tech via a different route. It can be hard to get people from being averse to tech to embrace it - start with a tiny little bit of motivation, from there on rely on self motivation.

The family's current project turned business is to support L.A. schools in helping children get a handle on tech.

The TAO of Hashicorp

In the second keynote Hashimoto gave an overview of the Tao of Hashicorp - essentially the values and principles the company is built on. What I found interesting about the talk was the fact that these values were written down very early in the process of building up Hashicorp when the company didn't have much more than five employees, that they comprised vision, roadmap and product design pieces, and that they have been applied to everyday decisions ever since.

The principles themselves cover the following points:
  • Workflows - not technologies. Essentially describing a UX first approach where tools are being mocked and used first before diving deeper into the architecture and coding. This goes as far as building a bash script as a mockup for a command line interface to see if it works well before diving into coding.
  • Simple, modular and composable. Meaning that tools built should have one clear purpose instead of piling features on top of each other for one product.
  • Communicating sequential processes. Meaning to have standalone tools with clear APIs.
  • Immutability.
  • Versioning through Codification. When having a question, the answer "just talk to X" doesn't scale as companies grow. There are several fixes to this problem. The one that Hashicorp decided to go for was to write knowledge down in code - instead of having a document detailing how startup works, have something people can execute.
  • Automate.
  • Resilient systems. Meaning to strive for systems that know their desired state and have means to go back to it.
  • Pragmatism. Meaning that the principles above shouldn't be applied blindly but adjusted to the problem at hand.
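
The "resilient systems" principle from the list above can be illustrated with a tiny reconciliation step: compare the desired state with the actual one and derive the actions that lead back to it. This is only a minimal sketch under my own assumptions; the service names and the `plan_changes` and `reconcile` helpers are made up for illustration and are not Hashicorp code.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, int]  # (verb, service, number of instances)


def plan_changes(desired: Dict[str, int], actual: Dict[str, int]) -> List[Action]:
    """Return the actions needed to move `actual` towards `desired`."""
    actions: List[Action] = []
    for service in sorted(set(desired) | set(actual)):
        want = desired.get(service, 0)
        have = actual.get(service, 0)
        if have < want:
            actions.append(("start", service, want - have))
        elif have > want:
            actions.append(("stop", service, have - want))
    return actions


def reconcile(desired: Dict[str, int], actual: Dict[str, int]) -> Dict[str, int]:
    """Apply the planned actions to a copy of `actual` and return the result."""
    state = dict(actual)
    for verb, service, count in plan_changes(desired, actual):
        delta = count if verb == "start" else -count
        state[service] = state.get(service, 0) + delta
        if state[service] == 0:
            del state[service]  # nothing left running for this service
    return state


# Hypothetical example state: one service is under-provisioned, one is obsolete.
desired = {"web": 3, "worker": 2}
actual = {"web": 1, "worker": 2, "legacy": 1}

print(plan_changes(desired, actual))
print(reconcile(desired, actual) == desired)
```

Running the same step again on a state that already matches yields no actions, which is what makes this kind of loop safe to repeat.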

While the content itself differs I find it interesting that Hashicorp decided to communicate in terms of their principles and values. This kind of setup reminds me quite a bit of the way Amazon Leadership principles are being applied and used inside of Amazon.

Integrating OSS in industrial environments - by Siemens

The third keynote was given by Siemens, a 170 year old German corporation with 350k employees, focussed on industrial appliances.

In their current projects they are using OSS in embedded projects related to power generation, rail automation (Debian), vehicle control, building automation (Yocto), medical imaging (xenomai on big machines).

Their reason for tapping into OSS more and more is to grow beyond their own capabilities.

A challenge in their applications relates to long term stability, meaning supporting an appliance for 50 years and longer. Running their appliances unmodified for years is not feasible anymore today due to policies and corporate standards that require updates in the field.

Trouble they are dealing with today is the cost of software forks - both self-inflicted and supplier-caused forks. The cost attached to these is one of the reasons for Siemens to think upstream-first, both internally as well as when choosing suppliers.

Another motivation for this approach is becoming part of the community, which pays off in three ways: Keeping talent. Learning best practices from upstream instead of failing oneself. Better communication with suppliers through official open source channels.

One project Siemens is involved with at the moment is the so-called Civil Infrastructure Platform project.

Another huge topic within Siemens is software license compliance. Being a huge corporation they rely on Fossology for compliance checking.

Linus Torvalds Q&A

The last keynote of the day was an on stage interview with Linus Torvalds. The introduction to this kind of format was lovely: There's one thing Linus doesn't like: Being on stage and giving a pre-created talk. Giving his keynote in the form of an interview with questions not shared prior to the actual event meant that the interviewer would have to prep the actual content. :)

The first question asked was fairly technical: Are RCs slowing down? The reason that Linus gave had a lot to do with proper release management. Typically the kernel is released on a time-based schedule, with one release every 2.5 months. So if some feature doesn't make it into a release it can easily be integrated into the following one. What's different with the current release is Greg Kroah Hartman having announced it would be a long term support release, so suddenly devs are trying to get more features into it.

The second question related to a lack of new maintainers joining the community. The reasons Linus sees for this are mainly related to the fact that being a maintainer today is still fairly painful as a job: You need experience to quickly judge patches so the flow doesn't get overwhelming. On the other hand you need to have shown to the community that you are around 24/7, 365 days a year. What he wanted the audience to know is that despite occasional harsh words he loves maintainers, and the project does want more maintainers. What's important to him isn't perfection - but having people that will own up to their mistakes.

One fix to the heavy load mentioned earlier (which was also discussed during the kernel maintainers' panel a day earlier) revolved around the idea of having a group of maintainers responsible for any single sub-system in order to avoid volunteer burnout, allow for vacations to happen, share the load and ease hand-over.

Asked about kernel testing Linus admitted to having been sceptical about the subject years ago. Today he's a really big fan of random testing/fuzzing in order to find bugs in code paths that are rarely if ever tested by developers.

Asked about what makes a successful project his take was the ability to find commonalities that many potential contributors share, the ability to find agreement, which seems easier for systems with less user visibility. An observation that reminded me of the bikeshedding discussions.

Also he mentioned that the problem you are trying to solve needs to be big enough to draw a large enough crowd. When it comes to measuring success though his insight was very valuable: Instead of focussing too much on outreach or growth, focus on deciding whether your project solves a problem you yourself have.

Asked about what makes a good software developer, Linus mentioned that the community over time has become much less homogeneous compared to when he started out in his white, male, geeky, beer-loving circles. The things he believes are important for developers are caring about what they do and being able to invest in their skills for a long enough period to develop perfection (much like athletes train a long time to become really successful). Also having fun goes a long way (though in his eyes this is no different when trying to identify a successful marketing person).

While Linus isn't particularly comfortable interacting with people face-to-face, e-mail for him is different. He does have side projects beside the kernel. Mainly for the reason of being able to deal with small problems, actually provide support to end-users, do bug triage. In Linux kernel land he can no longer do this - if things bubble up to his inbox, they are bound to be of the complex type, everything else likely was handled by maintainers already.

His reason for still being part of the Linux Kernel community: He likes the people, likes the technology, loves working on stuff that is meaningful, that people actually care about. On vacation he tends to check his mail three times a day to not lose track and be overwhelmed when he gets back to work. There are times when he goes offline entirely - however typically after one week he is longing to be back.

Asked about what further plans he has, he mentioned that for the most part he doesn't plan ahead of time, spending most of his life reacting and being comfortable with this state of things.

Speaking of plans: It was mentioned that likely Linux 5.0 is to be released some time in summer 2018 - numbers here don't mean anything anyway.

Nobody puts Java in a container

Jörg Schad from Mesosphere gave an introduction to how container technologies like Docker really work and how that applies to software run in the JVM.

He started off by explaining the advantages of containers: Isolating what's running inside, supplying standard interfaces to deployed units, sort of the write once, run anywhere promise.

Compared to real VMs they are more lightweight, however with the caveat of using the host kernel - meaning that crashing the kernel means crashing all container instances running on that host as well. In turn they are faster to spin up, need less memory and less storage.

So which properties do we need to look at when talking about having a JVM in a container? Resource restrictions (CPU, memory, device visibility, blkio etc.) are controlled by cgroups. Process spaces for e.g. pid, net, ipc, mnt, users and hostnames are controlled through kernel namespaces, which libcontainer sets up.

Looking at cgroups there are two aspects that are very obviously interesting for JVM deployments: For memory settings one can set hard and soft limits. However, much in contrast to the JVM, there is no such thing as an OOM error being thrown when resources are exhausted - a process exceeding the hard limit is simply killed. For the CPUs available there are two ways to configure limits: cpushares lets you give processes a relative priority weighting. cpusets lets you pin groups to specific CPUs.
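
To make the "relative priority weighting" of cpushares a bit more concrete, here is a minimal arithmetic sketch. It assumes that all groups are busy at the same time (shares only matter under contention); the container names and share values are hypothetical and were not part of the talk.

```python
from typing import Dict


def cpu_fraction_under_contention(shares: Dict[str, int]) -> Dict[str, float]:
    """Fraction of CPU time each group gets when all groups compete for CPU."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Hypothetical containers; 1024 is the usual default value for cpu.shares.
shares = {"jvm-a": 1024, "jvm-b": 512, "jvm-c": 512}
fractions = cpu_fraction_under_contention(shares)

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%} of CPU time under full contention")
```

When the other groups are idle, a group may use more than its fraction, which is one way in which shares differ from pinning via cpusets.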

General advice is to avoid cpusets as it removes one level of freedom from scheduling and often leads to less efficiency. However it's a good tool to avoid cpu-bouncing, and to maximise cache usage.

When trying to figure out the caveats of running JVMs in containers one needs to understand what the memory requirements for JVMs are: In addition to the well known, configurable heap memory, each JVM needs a bit of native JRE memory, permgen/metaspace, JIT bytecode space, JNI and NIO space as well as additional native space for threads. With permgen space turned into native metaspace that means that class loader leaks are capable of maxing out the memory of the entire machine - one good reason to lock JVMs in containers.

The caveats of putting JVMs into containers are related to JRE initialisation defaults being influenced by information like the number of cores available: It influences the number of JIT compilation threads, hotspot thresholds and limits.

One extreme example: When running ten JVM containers in a 32 core box this means that:
  • Each JVM believes it's alone on the machine, configuring itself to the maximally available CPU count.
  • Pre-Java-9 the JVM is not aware of cpusets, meaning it will think that it can use all 32 cores even if configured to use fewer than that.
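
The gap between what the host offers and what a container is actually pinned to can be sketched by counting the CPUs in a cpuset list such as "0-3,8" (the list format used by the kernel for cpuset.cpus). The helper name and the pinning in the example are assumptions made for illustration, not figures from the talk.

```python
def count_cpus(cpuset: str) -> int:
    """Count the CPUs in a cpuset list string such as '0-3,8,10-11'."""
    count = 0
    for part in cpuset.strip().split(","):
        if not part:
            continue  # empty string means no CPUs listed
        if "-" in part:
            first, last = part.split("-")
            count += int(last) - int(first) + 1
        else:
            count += 1
    return count


HOST_CORES = 32          # the box from the example above
container_cpuset = "0-3"  # hypothetical pinning of one JVM container

pinned = count_cpus(container_cpuset)
print(f"container may run on {pinned} cores")
print(f"a JVM unaware of cpusets still sizes itself for {HOST_CORES} cores")
```

With ten such containers on the box, each JVM sizes its thread pools for eight times the number of cores it is actually allowed to run on.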

Another caveat: JVMs typically need more resources on startup, leading to a need for overprovisioning just to get it started. Jörg promised a blog post to appear on how to deal with this question on the DC/OS blog soon after the summit.

Also for memory Java 9 provides the option to look at memory limits set through cgroups. The (still experimental) option for that, which has to be combined with -XX:+UnlockExperimentalVMOptions: -XX:+UseCGroupMemLimitForHeap
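
Why this matters can be shown with a bit of arithmetic. The sketch below assumes the commonly documented HotSpot default of using a quarter of the visible memory as maximum heap (MaxRAMFraction=4); that default as well as the host and container sizes are my assumptions and were not part of the talk.

```python
GIB = 1024 ** 3


def default_max_heap(visible_memory_bytes: int, fraction: int = 4) -> int:
    """Default maximum heap, assuming heap = visible memory / fraction."""
    return visible_memory_bytes // fraction


host_memory = 64 * GIB      # hypothetical host
container_limit = 2 * GIB   # hypothetical cgroup hard limit

# Without the option the JVM looks at the memory of the host ...
heap_without_option = default_max_heap(host_memory)
# ... with it, the cgroup limit is used as the basis instead.
heap_with_option = default_max_heap(container_limit)

print(f"without option: {heap_without_option / GIB:.1f} GiB heap")
print(f"with option:    {heap_with_option / GIB:.1f} GiB heap")
print("default heap exceeds container limit:", heap_without_option > container_limit)
```

Note that the heap is only one part of the JVM's memory needs, so even a heap that fits the limit leaves the native memory areas mentioned earlier to be accounted for.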

As a conclusion: Containers don't hide the underlying hardware - which is both good and bad.

Goal - question - metric approach to community measurement

In his talk on applying goals question metrics to software development management Jose Manrique Lopez de la Fuente explained how to successfully choose and use metrics in OSS projects.

He contrasted the OKR based approach to goal setting with the goal question metric approach. In the latter one first thinks about a goal to achieve (e.g. "We want a diverse community."), goes from there to questions that help understand the path to that goal better ("How many people from underrepresented groups do we have?"), and finally to actual metrics to answer those questions.

Key to applying this approach is a cycle that integrates planning, making changes, checking results and acting on them.

Goals, questions and metrics need to be in line with project goals, involve management and involve contributors. Metrics themselves are only useful for as long as they are linked to a certain goal.
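
The chain from goal to question to metric can be sketched as a small data structure that also makes it easy to spot metrics which are tracked without being linked to any goal. This is only an illustration under my own assumptions; the class names, the `unlinked_metrics` helper and the metric names are hypothetical and not taken from the talk or from any of the tools mentioned.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metric:
    name: str


@dataclass
class Question:
    text: str
    metrics: List[Metric] = field(default_factory=list)


@dataclass
class Goal:
    statement: str
    questions: List[Question] = field(default_factory=list)

    def metric_names(self) -> List[str]:
        """Names of all metrics reachable from this goal via its questions."""
        return [m.name for q in self.questions for m in q.metrics]


def unlinked_metrics(tracked: List[str], goals: List[Goal]) -> List[str]:
    """Metrics that are being tracked but do not answer any question of any goal."""
    linked = {name for goal in goals for name in goal.metric_names()}
    return sorted(set(tracked) - linked)


goal = Goal(
    "We want a diverse community.",
    [
        Question(
            "How many people from underrepresented groups do we have?",
            [Metric("contributors_from_underrepresented_groups")],
        )
    ],
)

# Hypothetical list of metrics a project happens to collect.
tracked = ["contributors_from_underrepresented_groups", "lines_of_code"]
print(unlinked_metrics(tracked, [goal]))
```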

What is needed to make this approach successful is a mature organisation that understands the metrics' value and refrains from gaming the system. People will need training on how to use the metrics, as well as transparency about metrics.

Projects dealing with applying more metrics and analytics to OSS projects include Grimoire Lab, CHAOSS (Community Health Analytics for OSS).

There are a couple of interesting books: Managing inner source projects, Evaluating OSS projects, as well as the Grimoire training, all of which are freely available online.

Container orchestration - the state of play

In his talk Michael Bright gave an overview of current container orchestration systems, going into some detail on Docker Swarm, Kubernetes and Apache Mesos. Technologies he left out are things like Nomad, Cattle, Fleet, ACS, ECS, GKE, AKS, as well as managed cloud.

What became apparent from his talk was that the high level architecture is fairly similar from tool to tool: Orchestration projects make sense where there are enough microservices to be unable to treat them like pets with manual intervention needed in case something goes wrong. Orchestrators take care of tasks like cluster management, micro service placement, traffic routing, monitoring, resource management, logging, secret management, rolling updates.

Often these systems build a cluster that apps can talk to, with masters managing communication (coordinated through some sort of distributed configuration management system, maybe some Raft based consensus implementation to avoid split brain situations) as well as workers that handle requests.
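
How majority based consensus avoids split brain situations comes down to simple arithmetic: only a partition holding more than half of the masters may keep making decisions. The following is a minimal sketch of that quorum rule, not of Raft itself; the function names are made up for illustration.

```python
def quorum(masters: int) -> int:
    """Smallest number of masters that forms a majority."""
    return masters // 2 + 1


def tolerated_failures(masters: int) -> int:
    """Number of masters that may fail while a majority remains."""
    return masters - quorum(masters)


def has_quorum(partition_size: int, masters: int) -> bool:
    """Whether one side of a network split may keep making decisions."""
    return partition_size >= quorum(masters)


for n in (3, 4, 5):
    print(f"{n} masters: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")

# A five master cluster split into groups of three and two:
print(has_quorum(3, 5), has_quorum(2, 5))
```

Since two disjoint partitions can never both hold a majority, at most one side continues to act as the cluster. It also shows why a fourth master adds no failure tolerance over three.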

Going into details Michael showed the huge takeup of Kubernetes compared to Docker Swarm and Apache Mesos, up to the point where even AWS joined CNCF.

For Thursday I went to see Rich Bowen's keynote on the Apache Way at MesosCon. It was great to hear how people were interested in the greater context of what Apache provides to the Mesos project in terms of infrastructure and mentoring. Also there were quite a few questions on what that thing called The Apache Software Foundation actually is at their booth at MesosCon.

Hopefully the initiative started on the Apache Community development mailing list on getting more information out on how things are managed at Apache will help spread the word even further.

Overall Open Source Summit, together with its sister events like e.g. KVM forum, MesosCon as well as co-located events like the OpenWRT summit was a great chance to meet up with fellow open source developers and project leads, learn about technologies and processes both familiar as well as new (in my case the QEMU on UEFI talk clearly was above my personal comfort zone in terms of understanding things - here it's great to be married to a spouse who can help fill the gaps after the conference is over). There was a fairly broad spectrum of talks from Linux kernel internals, to container orchestration, to OSS licensing, community management, diversity topics, compliance, and economics.

Open source summit - Day 2

2017-10-25 10:58
Day two of Open Source summit for me started a bit slow for lack of sleep. The first talk I went to was on "Developer tools for Kubernetes" by Michelle Noorali and Matt Butcher. Essentially the two of them showed two projects (Draft and Brigade) that help ease developing apps for Kubernetes clusters. Draft here is the tool to use for developing long running, daemon like apps. Brigade has the goal of making event driven app development easier - almost like providing shell script like composability to Kubernetes deployed pipelines.

Kubernetes in real life

In his talk on K8s in real life Ian Crosby went over five customer cases. He started out by highlighting the promise of magic from K8s: Jobs should automatically be re-scheduled to healthy nodes, traffic re-routed once a machine goes down. As a project it came out of Google as a re-implementation of their internal, 15 years old system called Borg. Currently the governance of K8s lies with the Cloud Native Computing Foundation, part of the Linux Foundation.

So what are some of the use cases that Ian saw talking to customers:
  • "Can you help us set up a K8s cluster?" - asked by a customer with one monolithic application deployed twice a year. Clearly that is not a good fit for K8s. You will need a certain level of automation, continuous integration and continuous delivery for K8s to make any sense at all.
  • There were customers trying to get into K8s in order to be able to hire talent interested in that technology. That pretty much gets the problem the wrong way around. K8s also won't help with organisational problems where dev and ops teams aren't talking with each other.
  • The first question to ask when deploying K8s is whether to go for on-prem, hosted externally or a mix of both. One factor pulling heavily towards a hosted solution is the level of time and training investment people are willing to make with K8s. Ian told the audience that he was able to migrate a complete startup to K8s within a short period of time by relying on a hosted solution, resulting in a setup that requires just one ops person to maintain. In that particular instance the tech that remained on-prem was Elasticsearch and Kafka as services.
  • Another client (government related, huge security requirements) decided to go for on-prem. They had strict requirements to not connect their internal network to the public internet, resulting in people carrying downloaded software on USB sticks from one machine to the other. The obvious recommendation to ease things at least a little bit is to relax the security requirements somewhat here.
  • In a third use case the customer tried to establish a prod cluster, staging cluster, test cluster, one dev cluster per developer - pretty much turning into a maintenance nightmare. The solution was to go for a one cluster architecture, using shared resources, but namespaces to create virtual clusters, role based access control for security, network policies to restrict which services can talk to each other, service level TLS to get communications secure. Looking at CI this can be taken one level further even - spinning up clusters on the fly when they are needed for testing.
  • In another customer case Java apps were dying randomly - apparently because what was deployed was using the default settings. Lesson learnt: Learn how it works first, go to production after that.

Rebuilding trust through blockchains and open source

Having pretty much no background in blockchains - other than knowing that a thing like bitcoin exists - I decided to go to the introductory "Rebuilding trust through blockchains and open source" talk next. Marta started off by explaining how societies are built on top of trust. However today (potentially accelerated through tech) this trust in NGOs, governments and institutions is being eroded. Her solution to the problem is called Hyperledger, a trust protocol to build an enterprise grade distributed database based on a permissioned block chain with trust built-in.

Marta went on detailing eight use cases:
  • Cross border payments: Currently, using SWIFT, these take days to complete, cost a lot of money, are complicated to do. The goal with rolling out block chains for this would be to make reconciliation real-time. Put information on a shared ledger to make it auditable as well. At the moment ANZ, WellsFargo, BNP Paribas and BNY Mellon are participating in this POC.
  • Healthcare records: The goal is to put pointers to medical data on a shared ledger so that procedures like blood testing are being done just once and can be trusted across institutions.
  • Interstate medical licensing: Here the goal is to make treatment reimbursement easier, probably even allowing for handing out fixed-purpose budgets.
  • Ethical seafood movement: Here the goal is to put information on supply chains for seafood on a shared ledger to make tracking easier, auditable and cheaper. The same applies for other supply chains, think diamonds, coffee etc.
  • Real estate transactions: The goal is to keep track of land title records on a shared ledger for easier tracking, auditing and access. Same could be done for certifications (e.g. of academic titles etc.)
  • Last but not least there is a POC to show how to use shared ledgers to track ownership of creative works in a distributed way and take the middleman distributing money to artists out of the loop.
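
What makes a shared ledger auditable in the use cases above can be illustrated with a plain hash chain: every entry includes the hash of its predecessor, so changing an old record invalidates everything after it. This is a minimal sketch of that one idea only. It is not Hyperledger's API, leaves out consensus and permissioning entirely, and the seafood records are made up for illustration.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def entry_hash(payload: Dict[str, str], previous_hash: str) -> str:
    """Hash of an entry, covering its payload and the previous entry's hash."""
    body = json.dumps({"payload": payload, "previous": previous_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger: List[dict], payload: Dict[str, str]) -> None:
    """Append a new entry that is chained to the current last entry."""
    previous_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append(
        {
            "payload": payload,
            "previous": previous_hash,
            "hash": entry_hash(payload, previous_hash),
        }
    )


def verify(ledger: List[dict]) -> bool:
    """Check that no entry was altered and that the chain is unbroken."""
    previous_hash = GENESIS
    for entry in ledger:
        if entry["previous"] != previous_hash:
            return False
        if entry["hash"] != entry_hash(entry["payload"], previous_hash):
            return False
        previous_hash = entry["hash"]
    return True


ledger: List[dict] = []
append(ledger, {"lot": "A-1", "step": "caught"})      # hypothetical records
append(ledger, {"lot": "A-1", "step": "processed"})
print(verify(ledger))

ledger[0]["payload"]["step"] = "farmed"  # tampering with an old record
print(verify(ledger))
```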

Kernel developers panel discussion

For the panel discussion Jonathan Corbet invited five different Linux kernel hackers in different stages of their career, with different backgrounds to answer audience questions. The panel featured Vlastimil Babka, Arnd Bergmann, Thomas Gleixner, Narcisa Vasile, Laura Abbott.

The first question revolved around how people had gotten started with open source and kernel development and what advice they would have for newbies. The one piece of advice shared by everyone, other than scratch your own itch and find something that interests you: Be persistent. Don't give up.

Talking about release cycles and moving too fast or too slow there was a comment on best practice to get patches into the kernel that I found very valuable: Don't get started coding right away. A lot of waste could have been prevented if people just shared their needs early on and asked questions instead of diving right into coding.

There was discussion on the meaning of long term stability. General consensus seemed to be that long term support really only includes security and stability fixes. No new features. Imagine adding current devices to a 20 year old kernel that doesn't even support USB yet.

There was a lovely quote by Narcisa on the dangers and advantages of using C as a primary coding language: With great power come great bugs.

There was discussion on using "new-fangled" tools like github instead of plain e-mail. Sure e-mail is harder to get into as a new contributor. However current maintainer processes heavily rely on that as a tool for communication. There was a joke about implementing their own tool for that just like was done with git. One argument for using something less flexible that I found interesting: Apparently it's hard to switch between subsystems just because workflows differ so much, so agreeing on a common workflow would make that easier.

Asked what would happen if Linus was eaten by a shark when scuba diving the answer was interesting: Likely at first there would be a hiding game because nobody would want to take up his work load. Next a team of maintainers would likely develop, collaborating in a consensus based model to keep up with things.

In terms of testing - that depends heavily on hardware being available to test on. Things like the kernel CI community help a lot with that.

I closed the day going to Zaheda Bhorat's talk on "Love what you do - everyday" on her journey in the open source world. It's a great motivation for people to start contributing to the open source community and become part of it - often for life, changing what you do in ways you would never have imagined before. Lots of love for The Apache Software Foundation in it.