Sunday, September 7, 2008

SourceForge Audience 2




The chart above shows how the intended audience has developed over time. All audiences grew in more or less the same proportion.
This visualisation of the SourceForge data shows the overall growth but does not emphasize relative shifts. I will look at those next week.

The bend in March 2006 is the result of improved data quality: since then, far fewer projects have left the ‘Audience’ field empty. I don’t know how this was achieved; the field cannot have become mandatory, since there were still some new projects for which ‘Audience’ was not entered.
Since Jan 2008 the number of projects without ‘Audience’ information has been rising again.

In this view each project counts as one. This is OK for a first try, but nearly half of all projects have not released a single file. The chart will look different when the relative importance of projects is considered.
It could be interesting to overlay the audience data with the number of downloads (as an approximation of importance from a user's point of view) and with the number of file releases (as an approximation of development activity).
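
A minimal sketch of what such a weighting could look like. The project records below are invented for illustration only and are not SourceForge data; the function name `shares` is hypothetical, and the sketch assumes one audience per project for simplicity.

```python
from collections import defaultdict

# Hypothetical sample records: (project name, audience, downloads, file releases).
# All names and numbers are invented for illustration; they are NOT SourceForge data.
projects = [
    ("alpha", "Developers", 1200, 4),
    ("beta", "Developers", 0, 0),
    ("gamma", "End Users/Desktop", 50000, 12),
    ("delta", "System Administrators", 300, 2),
    ("epsilon", "End Users/Desktop", 0, 0),
]


def shares(records, weight):
    """Return the percentage share per audience, weighting each record by weight(record)."""
    totals = defaultdict(float)
    for record in records:
        totals[record[1]] += weight(record)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {audience: 0.0 for audience in totals}
    return {audience: 100 * value / grand_total for audience, value in totals.items()}


by_count = shares(projects, lambda r: 1)         # every project counts as one
by_downloads = shares(projects, lambda r: r[2])  # weighted by downloads
by_releases = shares(projects, lambda r: r[3])   # weighted by file releases

for audience in by_count:
    print(f"{audience:<24}{by_count[audience]:>7.1f}"
          f"{by_downloads[audience]:>7.1f}{by_releases[audience]:>7.1f}")
```

With real data, projects that never released a file would simply drop out of the download and release views, which is the effect described above.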

Sunday, August 24, 2008

SourceForge Audience


Who is all the software available on SourceForge meant for?

Each project has a field ‘Audience’ where the project leader can state the target market for the software project.

Anyone searching for a project can use the Audience field to filter the results.

Entering audience information is not required, but only about a quarter of the projects lack it.

Up to 6 ‘Audiences’ can be selected from a given list of 19 entries.

This list is inconsistent because it mixes user types (developer, end user, sys admin) with branches of trade (manufacturing, aerospace) and professions (quality engineers).

Nevertheless, this is the list:

Audience                          Number of projects      %
No topic given                                 34601   23.3
Developers                                     38207   25.7
End Users/Desktop                              32527   21.9
System Administrators                          10392    7.0
Advanced End Users                              7185    4.8
Science/Research                                5298    3.6
Other Audience                                  5075    3.4
Information Technology                          4813    3.2
Education                                       4752    3.2
Telecommunications Industry                     1162    0.8
Customer Service                                 834    0.6
Financial and Insurance Industry                 681    0.5
Non-Profit Organizations                         675    0.5
Quality Engineers                                666    0.4
Healthcare Industry                              562    0.4
Manufacturing                                    465    0.3
Government                                       304    0.2
Religion                                         233    0.2
Aerospace                                        154    0.1
Legal Industry                                   125    0.1
Sum                                           148711    100

The top 4 audiences by user type (Developers, End Users/Desktop, System Administrators, Advanced End Users) together make up about 60%.
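
As a quick check, the percentages and the combined user-type share can be recomputed from the counts in the list. This is only a sketch; the choice of the four entries in `USER_TYPES` is my reading of "audience by user type" and is an assumption.

```python
# Counts copied from the list above (SourceForge data, collected around April 2008).
audience_counts = {
    "No topic given": 34601,
    "Developers": 38207,
    "End Users/Desktop": 32527,
    "System Administrators": 10392,
    "Advanced End Users": 7185,
    "Science/Research": 5298,
    "Other Audience": 5075,
    "Information Technology": 4813,
    "Education": 4752,
    "Telecommunications Industry": 1162,
    "Customer Service": 834,
    "Financial and Insurance Industry": 681,
    "Non-Profit Organizations": 675,
    "Quality Engineers": 666,
    "Healthcare Industry": 562,
    "Manufacturing": 465,
    "Government": 304,
    "Religion": 233,
    "Aerospace": 154,
    "Legal Industry": 125,
}

# Assumption: these four entries are the audiences "by user type".
USER_TYPES = [
    "Developers",
    "End Users/Desktop",
    "System Administrators",
    "Advanced End Users",
]

total = sum(audience_counts.values())
shares = {name: 100 * count / total for name, count in audience_counts.items()}
user_type_share = sum(shares[name] for name in USER_TYPES)

for name, pct in shares.items():
    print(f"{name:<34}{audience_counts[name]:>8d}{pct:>7.1f}")
print(f"User-type audiences together: {user_type_share:.1f} %")
```

The combined share comes out slightly below 60%, which is the rounded figure used in the text.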

The top entry, with 26%, is ‘Developers’. This is to be expected: a lot of projects are written by developers for themselves.

After the success of Linux and of open source server programs like Apache, the desktop was seen as the next frontier. Its current share is 22%.

Industry-specific software is rare: all industries together account for less than 5%.


It would be interesting to know if the share of projects for developers was higher in the past. I guess that it was a bit higher.

Over time an idea like ‘open source’ reaches new target groups. This means that the people using open source software now are different from the open source users 10 years ago. This should have an impact on the kind of new projects started.

Monday, August 18, 2008

SourceForge - From 0 to 150,000 projects in 8.5 years

SourceForge was launched in November 1999 by the company VA Linux. The company was later renamed to VA Software (2001) and then to SourceForge, Inc (2007).

SourceForge is the largest repository for open source projects. As of Aug 2008 it hosts more than 180,000 open source projects.


Open source projects could be registered on SourceForge from Nov 1999 onwards.

During the first month at least 184 projects were registered on SourceForge.

184 of them still existed in April 2008 when I checked the data. It is possible to delete projects from SourceForge, but the number of deletions is fairly low (I will analyse this in a later post).

The first day was Nov 4th, 1999, when 11 projects were registered, for example:

gedit http://sourceforge.net/projects/gedit/

Xemacs http://sourceforge.net/projects/xemacs/

Mesa3D http://sourceforge.net/projects/mesa3d/

Enlightenment http://sourceforge.net/projects/enlightenment/

All are still very active nearly 9 years later.

The growth that followed was huge. Each year until 2006, more projects were registered than in the year before.

10,000 projects were reached after 1.5 years.

100,000 projects were reached after 6.5 years.

The current count is 180,000 projects, after 8.5 years.

I started to collect SourceForge data in April 2008 (technical details will follow in another post).

The number of projects I use for my analysis is 148,711 (up to around April 2008), or 139,771 when I compare the complete years 2000–2007.

The early growth was exponential. At the moment growth is linear; 2007 was the first year in which fewer projects were registered than in the year before. The difference was tiny, only about 150 projects. We will see how the trend continues in 2008.

I expect growth to slow down in the years ahead.

Year              SF new projects per year
1999 (2 months)                        408
2000                                 5,302
2001                                 9,825
2002                                13,617
2003                                15,429
2004                                18,979
2005                                21,588
2006                                27,591
2007                                27,440
2008 (3 months)                      8,164
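
The running total and the year-over-year change can be derived from these per-year figures. The following is a small sketch; the helper name `first_year_reaching` is my own, and the comparison with the previous year is not meaningful for the partial years 1999 and 2008.

```python
from itertools import accumulate

# New projects per year, copied from the table above.
# 1999 covers two months and 2008 three months only.
new_projects = {
    "1999": 408,
    "2000": 5302,
    "2001": 9825,
    "2002": 13617,
    "2003": 15429,
    "2004": 18979,
    "2005": 21588,
    "2006": 27591,
    "2007": 27440,
    "2008": 8164,
}

years = list(new_projects)
counts = list(new_projects.values())
cumulative = list(accumulate(counts))  # running total of registered projects

# Print year, new projects, running total and change versus the previous year.
previous = None
for year, count, running_total in zip(years, counts, cumulative):
    change = "" if previous is None else f"{count - previous:+d}"
    print(f"{year}  {count:>6d}  {running_total:>7d}  {change:>7s}")
    previous = count


def first_year_reaching(threshold):
    """Return the first year in which the running total reaches the threshold."""
    for year, running_total in zip(years, cumulative):
        if running_total >= threshold:
            return year
    return None


print(first_year_reaching(10_000), first_year_reaching(100_000))
```

Summing only the complete years 2000 to 2007 gives the 139,771 projects mentioned above, and the 2006 to 2007 step is the only negative change between complete years.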

Slower growth need not be a bad thing. The number of programmers working on open source projects is growing, but limited.

It’s more fun, and more ego-boosting, to create a new project, but perhaps it makes more sense to support an existing one.

Growth of existing projects instead of new projects

And who needs 180,000 different software projects?

Of course this is the number of registered projects. The number of active projects will be smaller. How much smaller, I will try to find out.

In the chart above the gray area represents the huge project growth from zero to 150,000 projects in 8.5 years.

I will try to bring some structure into this gray area.

Saturday, August 9, 2008

Start: Open Source Dive

Open source projects on SourceForge

I use a couple of open source programs that are hosted on SourceForge (http://www.sf.net), e.g. KeePass, 7-Zip, Zeos, JVCL, Firebird, FlameRobin, XAMPP and PDFCreator.

There are currently more than 150,000 projects hosted on SourceForge. Shouldn’t I be able to find more good programs there?
Until a few months ago SourceForge showed the number of projects and the top 10 projects by downloads and activity on its entry page. Unfortunately this information is no longer shown.
Is there another web page where they show some statistics?

There are a few projects which collect data about open source in general and SourceForge especially, e.g. FLOSSmole, http://ossmole.sourceforge.net
The FLOSSmole extract of project names for June 2008 shows 153,843 different project names.

This is a huge number of projects.
How has this number evolved?
How many projects are live?
How many projects are for Linux or Windows or another operating system?

Let’s try to find out some facts about SourceForge.