Title: COMS W4156: Advanced Software Engineering
1. COMS W4156: Advanced Software Engineering
- Prof. Gail Kaiser
- Kaiser4156_at_cs.columbia.edu
- http://york.cs.columbia.edu/classes/cs4156/
2. What is Open Source Software?
- Open source usually refers to a program in which
the source code is available to the general
public for use and/or modification from its
original design free of charge.
- Open source code is typically created as a
collaborative effort in which programmers improve
upon the code and share the changes within the
community.
- The rationale for this movement is that a larger
group of programmers not concerned with
proprietary ownership or financial gain will
produce a more useful and bug-free product for
everyone to use.
- The concept relies on peer review to find and
eliminate bugs in the program code, a process
that commercially developed and packaged programs
do not employ.
3. Technical Case
- Central part of engineering tradition, part of
working method almost by instinct, for Internet
and Unix hackers.
- The running gears of the Internet are
astonishingly reliable relative to their nearest
commercial equivalents.
- TCP/IP, DNS, sendmail, Perl, Apache, ...
4. Economic Case
- There are companies making money programming
open-source software right now.
- If having a program written is a net economic
gain for a customer over not having it written, a
programmer will get paid whether or not the
program is going to be free after it's done.
5. Economic Value
- The use value of a program is its economic value
as a tool.
- The market value of a program is its value as a
saleable commodity.
- The monopoly value is the value you gain not just
from having the use of a program but from having
it be unavailable to your competitors.
6. Open-Source Doomsday (1)
- The market value and monopoly value of software
go to zero because of all the free sources out
there.
- Use value alone doesn't attract enough consumers
to support software development.
7. Open-Source Doomsday (2)
- The commercial software industry collapses.
- Programmers starve or leave the field.
- Doomsday arrives when the open-source culture
itself (dependent on the spare time of all these
pros) collapses, leaving nobody around who can
program competently.
8. Shaky Assumption 1: Programming will collapse if
software has no market value
- Proportion of all code written in-house at
companies other than software vendors: > 75%.
- Includes most MIS (the financial- and
database-software customizations).
- Also includes OEM software like device drivers
and embedded code for our increasingly
microchip-driven machines.
9. Shaky Assumption 1 (cont): Programming will
collapse if software has no market value
- Most vertical code is integrated with its
environment in ways that make reusing or copying
it very difficult.
- This is true whether the environment is a
business office's set of procedures or the
fuel-injection system of a combine harvester.
10. Combine Harvester
11. Shaky Assumption 1 (cont): Programming will
collapse if software has no market value
- Thus, as the environment changes, there is a lot
of work continually needed to keep the software
in step.
- Maintenance makes up the vast majority of what
programmers get paid to do.
- And it will still need to be done, even if/when
most software is open-source.
12. Shaky Assumption 1 (cont): Programming will
collapse if software has no market value
- Between originating, customizing and maintaining
vertical code (and related tasks like system
administration and troubleshooting), the use
value of software would still support the
millions of good jobs in that 75% even if all
horizontal or standalone software were free.
13. Shaky Assumption 2: Open-source software has no
market value
- Red Hat (among others) has built a flourishing
business selling software you can download for
free from Red Hat's own web site!
- What you're really buying from them is
handholding and support for the free stuff they
sell: a single place to go when you have
problems.
14. Shaky Assumption 3: Open-source software has no
monopoly value
- Adopting or even just studying someone else's
software is not a costless, frictionless process;
you need to dedicate skilled time to it.
- As product cycle times drop, coattail-riding gets
less attractive, because the payoff period
shrinks relative to the time you had to dedicate.
15. Shaky Assumption 3 (cont): Open-source software
has no monopoly value
- And time your skilled people spend studying
someone else's monopoly code is time you're
spending getting to where the competition used to
be (rather than where they are now).
16. Business Case 1
- High reliability
- Open-source software is peer-reviewed software;
it is thus more reliable than closed, proprietary
software.
- Mature open-source code is as bulletproof as
software ever gets.
17. Business Case 2-N
- Development Speed
- Lower Overhead
- Closeness to the Customer
- Broader Market
- Grab Mind Share (e.g., for startups)
18. Investor Case 1-2
- Support Sellers: give away the software product,
but sell distribution, branding, and after-sale
service.
- Loss Leaders: give away open-source as a
loss-leader and market positioner for closed
software.
19. Investor Case 3-4
- Widget Frosting: a hardware company goes
open-source in order to get better drivers and
interface tools cheaper.
- Accessorizing: selling accessories such as books,
compatible hardware, and complete systems with
open-source software pre-installed.
20. Customer Case 1
- Open source model applies even to internally
developed software.
- You are your developers' customer!
- Freedom from legal entanglements such as tracking
copies and usage.
- Such tracking is very hard to do accurately.
21. Customer Case 2
- Higher Security
- Security through obscurity just does not work.
- Closed sources create a false sense of security.
- The bad guys will always find the holes, but the
good guys will not find holes and fix them.
- It is harder to distribute trustworthy fixes when
a hole is revealed.
22. Marketing Case
- Why not call it, as we traditionally have, "free
software"?
- The term "free software" has a load of fatal
baggage; to a businessperson, it's too redolent
of fanaticism and flakiness and strident
anti-commercialism.
23. Marketing Case (cont)
- In marketing, appearance is reality. The
appearance that we're willing to climb down off
the barricades and work with the corporate world
counts for as much as the reality of our
behavior, our convictions, and our software.
24. Free Software Foundation
- The Free Software Foundation (FSF) is dedicated
to eliminating restrictions on copying,
redistribution, understanding, and modification
of computer programs.
- Free software is a matter of liberty, not
price.
- Think "free speech", not "free beer".
25. Free Software Tenets
- The freedom to run the program, for any purpose.
- The freedom to study how the program works, and
adapt it to your needs. Access to the source code
is a precondition for this.
26. Free Software Tenets (cont)
- The freedom to redistribute copies so you can
help your neighbor.
- The freedom to improve the program, and release
your improvements to the public, so that the
whole community benefits. Access to the source
code is a precondition for this.
27. Example Open Source License
- GNU General Public License (GPL), developed by
Richard Stallman and the Free Software Foundation
starting in 1985
- Certified by OSI
28. Open Source Initiative (OSI)
- http://www.opensource.org/
- Open source is a development method for software
that harnesses the power of distributed peer
review and transparency of process. The promise
of open source is better quality, higher
reliability, more flexibility, lower cost, and an
end to predatory vendor lock-in.
29. Open Source Initiative (OSI)
- The Open Source Initiative (OSI) is a non-profit
corporation formed to educate about and advocate
for the benefits of open source and to build
bridges among different constituencies in the
open-source community.
- One of our most important activities is as a
standards body, maintaining the Open Source
Definition for the good of the community. The
Open Source Initiative Approved License trademark
and program creates a nexus of trust around which
developers, users, corporations and governments
can organize open-source cooperation.
30. Open Source Definition
- Open source doesn't just mean access to the
source code.
- The distribution terms of open-source software
must comply with the following criteria
(upcoming slides).
31. Open Source Definition
- 1. Free Redistribution
- The license shall not restrict any party from
selling or giving away the software as a
component of an aggregate software distribution
containing programs from several different
sources. The license shall not require a royalty
or other fee for such sale.
32. Open Source Definition
- 2. Source Code
- The program must include source code, and must
allow distribution in source code as well as
compiled form. Where some form of a product is
not distributed with source code, there must be a
well-publicized means of obtaining the source
code for no more than a reasonable reproduction
cost, preferably downloading via the Internet
without charge.
33. Open Source Definition
- 2. Source Code
- The source code must be the preferred form in
which a programmer would modify the program.
Deliberately obfuscated source code is not
allowed. Intermediate forms such as the output of
a preprocessor or translator are not allowed.
34. Open Source Definition
- 3. Derived Works
- The license must allow modifications and derived
works, and must allow them to be distributed
under the same terms as the license of the
original software.
35. Open Source Definition
- 4. Integrity of The Author's Source Code
- The license may restrict source-code from being
distributed in modified form only if the license
allows the distribution of "patch files" with the
source code for the purpose of modifying the
program at build time.
36. Open Source Definition
- 4. Integrity of The Author's Source Code
- The license must explicitly permit distribution
of software built from modified source code. The
license may require derived works to carry a
different name or version number from the
original software.
37. Open Source Definition
- 5. No Discrimination Against Persons or Groups
- The license must not discriminate against any
person or group of persons.
38. Open Source Definition
- 6. No Discrimination Against Fields of Endeavor
- The license must not restrict anyone from making
use of the program in a specific field of
endeavor. For example, it may not restrict the
program from being used in a business, or from
being used for genetic research.
39. Open Source Definition
- 7. Distribution of License
- The rights attached to the program must apply to
all to whom the program is redistributed without
the need for execution of an additional license
by those parties.
40. Open Source Definition
- 8. License Must Not Be Specific to a Product
- The rights attached to the program must not
depend on the program's being part of a
particular software distribution.
41. Open Source Definition
- 8. License Must Not Be Specific to a Product
- If the program is extracted from that
distribution and used or distributed within the
terms of the program's license, all parties to
whom the program is redistributed should have the
same rights as those that are granted in
conjunction with the original software
distribution.
42. Open Source Definition
- 9. License Must Not Restrict Other Software
- The license must not place restrictions on other
software that is distributed along with the
licensed software. For example, the license must
not insist that all other programs distributed on
the same medium must be open-source software.
43. Open Source Definition
- 10. License Must Be Technology-Neutral
- No provision of the license may be predicated on
any individual technology or style of interface.
44. Open Source Licenses
- Open Source Licenses (by name or by category)
comply with the Open Source Definition and are
listed here after going through the approval
process. We also track the approval status of
licenses.
45. The Cathedral and the Bazaar
- Eric Raymond (esr)
- first presented May 1997, ongoing revision
through September 2000
- http://www.firstmonday.org/issues/issue3_3/raymond/
or
- http://www.catb.org/esr/writings/cathedral-bazaar/cathedral-bazaar/
46. The Cathedral
- Draws an analogy between traditional closed
source development and a cathedral, in which
there is a rigid hierarchy among developers,
managers, testers, etc.
47. The Cathedral (cont)
- esr originally believed that the most important
software (operating systems and really large
tools like the Emacs programming editor) needed
to be built like cathedrals, carefully crafted by
individual wizards or small bands of mages
working in splendid isolation, with no beta to be
released before its time.
48. The Bazaar
- Likens open source projects to Middle Eastern
bazaars, where numerous merchants hawk their
wares loudly to passersby.
- Little hierarchy among contributors.
- Contributors compete to have their modifications
inserted into the next release, bringing
recognition and reputation.
49. Based Primarily on Linux
- Describes Linus Torvalds' style of development
as "release early and often, delegate everything
you can, be open to the point of promiscuity."
50. Open Source Lessons (1)
- Every good work of software starts by scratching
a developer's personal itch.
- Good programmers know what to write. Great ones
know what to rewrite (and reuse).
- "Plan to throw one away; you will, anyhow." (Fred
Brooks, The Mythical Man-Month, Chapter 11)
51. Open Source Lessons (2)
- If you have the right attitude, interesting
problems will find you.
- When you lose interest in a program, your last
duty to it is to hand it off to a competent
successor.
- Treating your users as co-developers is your
least-hassle route to rapid code improvement and
effective debugging.
52. Open Source Lessons (3)
- Release early. Release often. And listen to your
customers.
- Given a large enough beta-tester and co-developer
base, almost every problem will be characterized
quickly and the fix obvious to someone.
- Smart data structures and dumb code works a lot
better than the other way around.
53. Open Source Lessons (4)
- If you treat your beta-testers as if they're your
most valuable resource, they will respond by
becoming your most valuable resource.
- The next best thing to having good ideas is
recognizing good ideas from your users. Sometimes
the latter is better.
- Often, the most striking and innovative solutions
come from realizing that your concept of the
problem was wrong.
54. Open Source Lessons (5)
- Perfection (in design) is achieved not when there
is nothing more to add, but rather when there is
nothing more to take away.
- Any tool should be useful in the expected way,
but a truly great tool lends itself to uses you
never expected.
55. Necessary Preconditions for the Bazaar Style (1)
- One cannot code from the ground up in bazaar
style.
- One can test, debug and improve in bazaar style,
but it would be very hard to originate a project
in bazaar mode.
- Your nascent developer community needs to have
something runnable and testable to play with.
56. Necessary Preconditions for the Bazaar Style (2)
- To start community-building, what you need to be
able to present is a plausible promise.
- Your program doesn't have to work particularly
well. It can be crude, buggy, incomplete, and
poorly documented.
57. Necessary Preconditions for the Bazaar Style (3)
- What it must not fail to do is (a) run, and (b)
convince potential co-developers that it can be
evolved into something really neat in the
foreseeable future, e.g., through strong,
attractive basic design.
58. Bazaar Project Coordinator
- It is not critical that the coordinator be able
to originate designs of exceptional brilliance,
but it is absolutely critical that the
coordinator be able to recognize good design
ideas from others.
- The open-source community's internal market in
reputation exerts subtle pressure on people not
to launch development efforts they're not
competent to follow through on.
59. Bazaar Project Coordinator (cont)
- Must have good people and communications skills.
- In order to build a development community, you
need to attract people, interest them in what
you're doing, and keep them happy about the
amount of work they're doing.
60. What Happened to Mythical Man Month?
- Fred Brooks, in The Mythical Man-Month, observed
that programmer time is not fungible: adding
developers to a late software project makes it
later.
- He argued that the complexity and communication
costs of a project rise with the square of the
number of developers, while work done only rises
linearly.
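
The quadratic-versus-linear argument can be made concrete with a small sketch. One common way to model "rises with the square" is to count pairwise communication channels, n(n-1)/2; that counting model, and the function names below, are assumptions for illustration and are not taken from the slides or from Brooks' text.

```python
def communication_channels(developers: int) -> int:
    """Count distinct developer pairs: n * (n - 1) / 2 (assumed model)."""
    if developers < 0:
        raise ValueError("developers must be non-negative")
    return developers * (developers - 1) // 2


def work_capacity(developers: int, units_per_developer: int = 1) -> int:
    """Idealized output that grows linearly with headcount (assumed model)."""
    if developers < 0:
        raise ValueError("developers must be non-negative")
    return developers * units_per_developer


if __name__ == "__main__":
    # Print headcount, linear work capacity, and pairwise channels side by side.
    for n in (2, 5, 10, 50):
        print(n, work_capacity(n), communication_channels(n))
```

Under this model, doubling the team doubles the idealized work capacity but roughly quadruples the number of channels to keep in sync, which is the tension the slide describes.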
61. What Happened to Mythical Man Month? (cont)
- This claim has since become known as "Brooks'
Law" and is widely regarded as a truism.
- But if Brooks' Law were the whole picture, Linux
would be impossible.
62. Egoless Programming
- Gerald Weinberg, in The Psychology of Computer
Programming, supplied a vital correction to
Brooks.
- His discussion of "egoless programming" observed
that in shops where developers are not
territorial about their code, and encourage other
people to look for bugs and potential
improvements in it, improvement happens
dramatically faster than elsewhere.
63. But...
- That can't be the whole story
64. Who Invented Open Source?
- No one knows
- Some say Linus Torvalds, initial developer of
Linux (c. 1992)
- Some say Richard Stallman, founder of GNU Project
(c. 1985)
- But lots of earlier software is public domain,
e.g., original implementations of TCP/IP, DNS,
sendmail, various other networking software
65. Linus' Law according to Eric S. Raymond
- Given enough eyeballs, all bugs are shallow
66. Linus' Law according to Linus Torvalds
- "Linus's Law says that all of our motivations
fall into three basic categories. More important,
progress is about going through those very same
things as 'phases' in a process of evolution, a
matter of passing from one category to the next.
The categories, in order, are 'survival', 'social
life', and 'entertainment'."
- Himanen, Pekka; Linus Torvalds; Manuel Castells.
The Hacker Ethic. Random House, 2001. ISBN
0-375-50566-0.
67. Why Didn't Open Source Happen Earlier?
- Well, it did
- That's where Unix and the Internet came from
- But in a relatively small academic-oriented
circle
- And the pace was very slow
68. Why Didn't Open Source Happen Earlier?
- Legal constraints of various licenses, trade
secrets, and commercial interests (e.g., wrt
Unix)
- The Internet wasn't (yet) good enough: egoless
programming could only work in geographically
compact communities.
69. Web and ISP Industry
- Linux was the first project to make a conscious
and successful effort to use the entire world as
its talent pool.
- The gestation period of Linux coincided with the
birth of the World Wide Web, and Linux left its
infancy during the same period in 1993-1994 that
saw the takeoff of the ISP industry and the
explosion of mainstream interest in the Internet.
70. An Opposing View: Robert Glass, "Open Source and
Hype", 7 July 2003
- http://www.stickyminds.com/sitewide.asp?ObjectId=6535&Function=DETAILBROWSE&ObjectType=COL
71. Open Source and Hype
- Most of what I dislike about the open source
movement can be summed up in one word: Hype.
Unfortunately, and perhaps surprisingly, the
advocates of open source are no better in this
regard than their proprietary colleagues.
72. Best People
- The claim is frequently made that open source
programmers are the best programmers around.
- Attempts to define Programmer Aptitude Tests,
which evaluate the capabilities of subjects to
become good programmers, have historically been
failures.
- Although some programmers are better than others,
nothing in the field's research suggests that we
have found an objective way of determining who
those best people are.
- Since we can't identify who the best people are,
there is no way to study the likelihood of them
being open source programmers.
73. Most Reliable
- The claim is also frequently made that open
source software is the most reliable software
available.
- Open source advocates claim that a study
identified as the "Fuzz Papers" produced results
that showed that their software was more reliable
than proprietary alternatives.
- But the Fuzz Papers have virtually nothing to say
about open source software, one way or the other,
and their author (according to Glass) agrees with
that assessment.
74. The Fuzz Papers
- http://www.cs.wisc.edu/~bart/fuzz/fuzz.html
- Fuzz testing: a simple technique for feeding
random input to applications.
- The input is completely random.
- Fuzz testing does not use any model of program
behavior, application type, or system
description.
75. The Fuzz Papers
- The reliability criterion is simple: if the
application crashes or hangs, it is considered to
fail the test; otherwise it passes.
- Note that the application does not have to
respond in a sensible manner to the input, and it
can even quietly exit.
- Fuzz testing can be automated to a high degree
and results can be compared across applications,
operating systems, and vendors.
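
The pass/fail criterion described on these two slides can be sketched in a few lines. This is a minimal illustration, not the tooling used in the Fuzz Papers: the function names, the 5-second hang timeout, and the choice of standard input as the delivery channel are all assumptions, and crash detection relies on the POSIX convention that a negative return code means the process was killed by a signal.

```python
import random
import subprocess
import sys
from typing import Dict, List


def random_input(rng: random.Random, max_len: int = 1024) -> bytes:
    """Return completely random bytes; no model of the target is used."""
    length = rng.randint(0, max_len)
    return bytes(rng.randrange(256) for _ in range(length))


def fuzz_once(command: List[str], data: bytes, timeout_s: float = 5.0) -> str:
    """Feed data to the command on stdin; classify as pass, crash, or hang."""
    try:
        result = subprocess.run(
            command,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # assumed threshold for calling it a hang
        )
    except subprocess.TimeoutExpired:
        return "hang"
    # POSIX assumption: a negative return code means death by signal.
    if result.returncode < 0:
        return "crash"
    # Any ordinary exit, even a non-zero "quiet exit", counts as a pass.
    return "pass"


def fuzz(command: List[str], trials: int = 20, seed: int = 0) -> Dict[str, int]:
    """Run several random trials and tally the outcomes."""
    rng = random.Random(seed)
    counts = {"pass": 0, "crash": 0, "hang": 0}
    for _ in range(trials):
        counts[fuzz_once(command, random_input(rng))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical target: a tiny Python program that just consumes stdin.
    target = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
    print(fuzz(target, trials=5))
```

Because the tally depends only on crash, hang, or pass, the same harness can be pointed at different commands and the counts compared, which is the kind of cross-application comparison the slide mentions.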
76. Back to Open Source and Hype: More Secure
- There are many claims that open source is more
secure than proprietary.
- There is very little evidence on either side
regarding open source software and security.
- Security holes have been found in proprietary
software.
- Security holes have been found in open source
code.
- It is all too easy for programmers to leave
holes, independent of how the code is written.
77. Robert Glass Concludes
- So where do I stand on open source? I see
nothing, in particular, wrong with its fundamental
ideas and ideals. But I see plenty wrong with the
hype surrounding it. Not that it's any worse than
its proprietary brethren in this respect. It's
just that I expected more from this particular
group! Yes, I do expect more from the open source
advocates.
78. Open Systems ≠ Open Source
- Open systems concept predates OSI and FSF and
may be more palatable to conventional software
vendors
- Concerned with system integration
- Standard interfaces and protocols
- Includes but not limited to component model
frameworks
- Industry consortia: The Open Group's Single Unix
Spec, Object Management Group's CORBA, W3C's XML
79. Next Assignment
- Code inspection week: Tuesday November 20th
through Thursday November 29th
- Second Iteration Progress Report due Friday
November 30th
80. Upcoming Deadlines
- Code inspection week: Tuesday November 20th
through Thursday November 29th
- Second Iteration Progress Report due Friday
November 30th
- Demo week: Monday December 3rd through Monday
December 10th
- Second Iteration Final Report due Tuesday
December 11th