Title: Unison and replicating TWiki
1Unison and replicating TWiki
- A talk for the Toronto Perl Mongers
- Martin@Cleaver.org
- 24 Nov 2005
- (Many slides herein are replicated from other
presentations found on the internet)
2About me
- Background
- BSc / MSc Computing Science (Newcastle University, UK, 1994)
- Master of Business Administration (Melbourne Business School, Australia, 2004)
- Ex-programmer for Reuters International Technology Product group
- Ex-manager at Arthur Andersen
- Interests
- Concept Mapping
- Wiki
- Organisational Design, Innovation
- Middleware (Message Oriented MQ, TIBCO Message Brokers)
- Currently working at Helix Commerce International
- http://www.helixcommerce.com
- http://www.collaborationcommerce.com
- Specialists in Collaboration, Social Network Analysis, Portal work
- Draws on a large network of experts in the Greater Toronto Area
3Motivation
- Backup my wiki on the hosting provider
- Wiki on the road
- Sync to my palm top
- Because I want to sync the digimemo
- and because it's fun
4Read Write Offline Wiki
- A read-write offline wiki is for people in the field who need to change content while offline. (Topic started in OfflineWiki)
- Pros
- Can edit and search content while offline.
- Webs shared this way cannot be censored or controlled by a single individual or small group of individuals without significant implementation overhead. MS
- Can be implemented as a bolt-on, and "grown into", rather than increasing the initial install hurdle. MS
- Cons/catch
- Setup issues: web server and TWiki need to be installed on the client.
- Intelligent merge is necessary, for example a TWikiWithCVS.
- Webs shared in this way lose the ability to perform access control, unless a heavyweight, tightly integrated approach is taken. MS
- How
- Work on content independently on different TWiki installations.
- Synchronize periodically.
- http://twiki.org/cgi-bin/view/Codev/ReadWriteOfflineWiki
5Other domains where data synchronization is useful
- Synchronizing files on a PDA
- Synchronizing directory services such as an LDAP or Active Directory server
- Any application where there is a mobile workforce
- E.g. synchronizing data in a Customer Relationship Management (CRM) or Sales Force Automation (SFA) application between a compact mobile database on a PDA and a full-size relational database
6Today's toolset
- TWiki Dakar
- A rewrite of the guts of TWiki
- Truly free and open source Wiki
- >90% compatible
- A shift from coreteam to collaboration
- Some functional improvements
- Some speed improvements
- Test suites
- Massively restructured
- Unison
- File synchroniser
- Think "My Briefcase", but for gigabytes of data
- Academic Project by University of Pennsylvania
- Robust, portable and very useful
7Wiki
8What is a wiki?
- A wiki is a website where a community collectively maintains and socializes ideas through a set of named hypertext pages.
- Uses include
- Terminology database
- Meeting minutes
- Initiative Status pages
- Personal pages
- What's going on
- Users each have a log-in and are granted access to edit pages.
- A Recent Changes page shows who has made what change
- Every page is revision controlled.
- Every alteration on a page can be tracked back to the user that made the change
- Subscribers are notified by email as changes occur
9What is a wiki?
- A wiki is an online space based on a set of discussion documents interconnected with key words. Participants can add a discussion or can build on each other's thoughts by adding or moving the text around between existing documents. In this way a set of interconnected documents emerges that represents the sum of ideas in play.
- The key to collaboration on a wiki is interjecting and building thoughts at the right place of an existing document: ideas each get their own "watercooler", a place named according to key words where discussions on that topic are 1) collected and 2) continually synthesised.
- These key words become common knowledge and become a linkage between verbal and online discussion.
- Wikis are characterised by both a 'Recent Changes' page, which sorts conversation by the last contribution, and a 'Notifications' feature, which allows participants to be notified (by email, messenger) that new topics have been posted or that topics of interest have been modified.
- The result is a place where individuals and groups can define, refine and relate terminology. The goal is conceptual integration, such that the terminology evolves to greater inclusiveness and power through integrating a wider number of opinions whilst maintaining higher definitional power.
- Conceptual models can be discussed and refined
- The models provide seed for new ideas
- Ideas are fleshed out until socially accepted
- New stakeholders can see and resurrect previous
discussions
10Many Wikis to choose from
- TWiki
- KWiki
- MediaWiki
- TikiWiki
- 100s
11Unison
- http://www.cis.upenn.edu/~bcpierce/unison/index.html
12Unison (v 2.7.7)
- GNU General Public License
- Runs on both Windows (95, 98, NT, and 2k) and Unix (Solaris, Linux, etc.)
- User-level program
- Single executable
- Deals with symlinks, file permissions, modtimes, uids, etc.
- Deals with updates to both replicas
- Non-conflicting updates propagate automatically
- Works between any pair of directories
- Works between any pair of machines
- direct socket link
- tunneling over rsh or ssh
- Resilient to failure
- Tuned for high (Ethernet) and medium (PPP) bandwidth connections
- Uses the rsync algorithm
- 15k lines of OCaml
Banging your head against a wall uses 150
calories an hour.
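The rule "non-conflicting updates propagate automatically" can be illustrated with a small sketch. This is not Unison's actual implementation (which is written in OCaml); it assumes each replica is summarised as a mapping from path to a content fingerprint, and that a record of the last synchronised state is kept. The names `reconcile`, `archive`, `replica_a` and `replica_b` are hypothetical.

```python
from typing import Dict

# A snapshot maps a relative path to a content fingerprint (any string).
Snapshot = Dict[str, str]


def reconcile(archive: Snapshot, a: Snapshot, b: Snapshot) -> Dict[str, str]:
    """Classify each path by comparing both replicas with the last synced state."""
    actions = {}
    for path in sorted(set(archive) | set(a) | set(b)):
        base, in_a, in_b = archive.get(path), a.get(path), b.get(path)
        if in_a == in_b:
            actions[path] = "in sync"            # same content, or deleted on both
        elif in_a == base:
            actions[path] = "propagate B -> A"   # only replica B changed
        elif in_b == base:
            actions[path] = "propagate A -> B"   # only replica A changed
        else:
            actions[path] = "conflict"           # both changed, in different ways
    return actions


# Hypothetical example data: three topics last synchronised at "v1".
archive = {"WebHome.txt": "v1", "Notes.txt": "v1", "Old.txt": "v1"}
replica_a = {"WebHome.txt": "v2", "Notes.txt": "v2a", "Old.txt": "v1"}
replica_b = {"WebHome.txt": "v1", "Notes.txt": "v2b"}  # Old.txt deleted on B
actions = reconcile(archive, replica_a, replica_b)
```

Here WebHome.txt flows from A to B, the deletion of Old.txt flows from B to A, and Notes.txt is reported as a conflict because both sides changed it.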
13Synchronization (a simple example)
14As long as they do not conflict
15A more interesting example
16Three reasonable possibilities
17Heterogeneity
18System Architecture
19Robustness
20Merging
21Controlling Unison
- By either
- Profiles
- Command line switches
- Text Only
- Useful for batch
- Used by SyncContrib
- GUI
- Fantastic for debugging
- Needed for reconciling conflicts
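As a rough illustration of driving Unison from command line switches in text-only batch mode, the sketch below assembles an argument list. The helper name `unison_command` and the example roots are hypothetical; the switches used (`-ui text`, `-batch`, `-logfile`, `-ignore`) are the ones that appear in the SyncContrib configuration later in this talk.

```python
from typing import List, Optional


def unison_command(root_a: str, root_b: str, batch: bool = False,
                   text_ui: bool = False, logfile: Optional[str] = None,
                   ignore: Optional[List[str]] = None) -> List[str]:
    """Build an argument list for a two-root Unison run (nothing is executed)."""
    argv = ["unison", root_a, root_b]
    if text_ui:
        argv += ["-ui", "text"]          # text-only interface, no GUI
    if batch:
        argv.append("-batch")            # ask no questions
    if logfile:
        argv += ["-logfile", logfile]
    for pattern in ignore or []:
        argv += ["-ignore", pattern]     # one -ignore switch per pattern
    return argv


# Hypothetical roots, for illustration only.
cmd = unison_command("/tmp/replica-a", "/tmp/replica-b",
                     batch=True, text_ui=True, ignore=["Regex .*,v"])
```

The resulting list could be handed to `subprocess.run(cmd)` on a machine where `unison` is on the PATH. Leaving out `batch` and `text_ui` corresponds to the interactive GUI case used for reconciling conflicts.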
22Unison File Synchronizer
1. Create a profile that maps the Source path and the Synch path.
23File Synchronizer Tools (Continued)
Source Directory
Synch Directory
24File Synchronizer Tools (Continued)
Source Directory
2. Some changes are applied to the Source Directory.
Synch Directory
25File Synchronizer Tools (Continued)
3. Update the non-conflicting structure.
A reconciler
26Options
27Sync Contrib
28Problem Definition
- Typically laptop and server
- Typically cross-platform
- Inserts, deletes, updates could happen on either server
- Want to make updates made on either available on both
- Fairly tricky to install on a laptop
29The problem
- What I set out to solve
- Backup from a TWiki server to my laptop
- Using a free tool
- That could run automatically
- What I actually solved
- Two way synchronisation of TWiki
- Free tools
- That can be run interactively or in the background
- ReadWriteOfflineWiki
- But it's not perfect
30Two Wiki servers
Hosting Provider: Linux Debian Sarge @ Dreamhost, Debian Perl 5.8, TWiki (any with TWiki Store 1.0), Unison, SSH access
Laptop: Windows XP, IndigoPerl 5.8 (Apache and Perl in C:\moreprgs\IndigoPerl\apache and C:\moreprgs\IndigoPerl\perl), TWiki Dakar, Unison (C:\moreprgs\Unison\Unison.exe)
The two installations are labelled TWiki A and TWiki B
Call through ssh to unison -server
Unison does have a server mode too
31Multiple sites merged into one composite site
On TWiki I use nested webs
(Diagram: Server 1, Server 2 and Server 3; Site A, Site B and Site C with Webs A to D and SubWebs A to D; in the composite site each source site appears as a nested web, Site A web, Site B web and Site C web, i.e. webs inside sites.)
32Inside each web is pub (attachments) and data
data/Web/MyTopic.txt
data/Web/AnotherTopic.txt
pub/Web/MyTopic/AttachmentA.jpg
pub/Web/MyTopic/AttachmentA2.doc
pub/Web/AnotherTopic/AttachmentB.gif
(The same data and pub layout is shown on both replicas.)
Unison
- Syncs each directory separately
- Unison GUI appears multiple times
- Does not use profiles
- Your changes in the GUI have no lasting effect
- Ignores templates
- So skins too
- Ignores bin
- So functionality too
- Ignores lib
- So plugins too
- Do plugins really belong in the same hierarchy as
TWiki?
33Sync Contrib
- A script loosely coupled to TWiki
- Knows about the layout of TWiki's filesystem
- Sets up the command line options for unison
- Writes out a temporary intermediate launcher
script
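A minimal sketch of the "knows about the layout" step: for every web, the data and pub directories are paired up separately on the client and the server, giving one Unison run per pair. This is not SyncContrib's actual code; the function name `root_pairs` and the example roots are hypothetical.

```python
from typing import List, Sequence, Tuple


def root_pairs(client_root: str, server_root: str, webs: Sequence[str],
               subdirs: Sequence[str] = ("data", "pub")) -> List[Tuple[str, str]]:
    """Return (client directory, server directory) pairs, one per web and subdir."""
    pairs = []
    for web in webs:
        for sub in subdirs:
            pairs.append((f"{client_root}/{sub}/{web}",
                          f"{server_root}/{sub}/{web}"))
    return pairs


# Hypothetical roots, for illustration only.
pairs = root_pairs("c:/twiki", "ssh://user@example.org/twiki", ["Sandbox", "Main"])
```

Each pair would then become the two roots of a separate Unison invocation, which is why the Unison GUI appears several times in a single sync.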
34Sync Contrib configuration
- The default Unison profile must not exist
- You must have set up a shared key
- Pageant must be running
- Designed to be run on a Windows box
- Common debug options
- SYMPTOM
- Two or more files on a Unix system have names identical except for case.
- They cannot be synchronized to a Windows file system.
- DIAGNOSIS
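One way to look for the symptom above before syncing is to scan a list of file names for entries that differ only in case. This is a hypothetical helper, not part of Unison or SyncContrib; the name `case_collisions` is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def case_collisions(names: Iterable[str]) -> List[List[str]]:
    """Group names that are identical except for case."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(set(names)):
        groups[name.casefold()].append(name)   # key ignores case
    return [group for group in groups.values() if len(group) > 1]


collisions = case_collisions(["WebHome.txt", "webhome.txt", "Notes.txt"])
```

On the Unix side the names could come from `os.listdir` over each web directory; any group returned would need renaming before it can be copied to a Windows file system.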
35Sync Contrib Sample Configuration File
- UnisonSyncoptions
- 'syncAccounts' => 'TestAccountRemote',
- 'clientServerPrivateKey' => 'c:\\Documents and Settings\\Martin Cleaver\\PuttyPrivateKey.ppk',
- 'clientRoot' => 'c:/moreprgs/indigoperl/apache/TWiki',
- 'plinkExecutable' => 'c:\\program files\\putty\plink.exe',
- 'plinkTempLauncherScriptFile' => 'c:\\temp\\plinkLauncher.bat',
- 'clientUnisonExecutable' => 'c:\\moreprgs\\unison\\unison-2.12.15-win-gui.exe',
- 'serverUnisonExecutable' => 'unison',
- 'unisonStdErrFile' => 'timestamp-accountName.stdErr',
- 'unisonCaptureErrors' => '2> unisonStdErrFile',
- 'unisonCaptureLog' => 'timestamp-accountName.uniLog',
- 'unisonLogfile' => 'timestamp-accountName.log',
- 'unisonUimode' => '-ui text',
- 'unisonOptions' => 'unisonBatchMode unisonDebugMode unisonUimode -logfile unisonCaptureLog unisonIgnoreTWikiHistory unisonTimeOutProtection',
- 'unisonIgnoreTWikiHistory' => '-ignore "Regex .*,v"',
- 'unisonTimeOutProtection' => '-sshargs "-o ServerAliveInterval=60"',
- 'unisonDebugMode' => '-debug none', (alternatives: none, all, update)
36Config cont.
- 'dataDir' => 'data',
- 'pubDir' => 'pub',
- 'protocol' => 'ssh',
- 'accounts' =>
- 'TestAccountRemote' =>
- serverRoot => '/home/mrjc/cairotwiki.mrjc.com/twiki',
- serverAccount => 'mrjc',
- serverSite => 'mrjc.com',
- webs => 'Sandbox',
- debug => '2',
- dryrun => '',
- ,
- 'TestAccountLocal' =>
- clientRoot => 'c:/temp/syncContribTest/TWiki',
- serverRoot => 'c:/moreprgs/indigoperl/apache/TWiki',
- webs => 'Sandbox', 'CleaverSite',
- unisonOptions => '-backups',
Overrides the config keys above
37Config cont.
- 'Palm' =>
- serverRoot => 'd:',
- ,
- 'WikiConsultingSite' =>
- serverRoot => '/home/mrjc/wikiconsulting.com/twiki',
- serverAccount => 'mrjc',
- serverSite => 'mrjc.com',
- webs => 'WikiConsulting', 'People', 'PublicWikis', 'KmSurvey2003', 'PublicWikisOverview', 'WikiHistory', 'WikiHosting', 'WikiSuccessStories', 'WikiTechnologies',
- debug => '0'
- ,
- 1
38Insurance (i.e. in the eventuality of no
connection)
42Filesystem Replication vs. Messaging Middleware
- Messaging middleware is another way
- Some free implementations
- E.g. MQ
- Need to build more into the application from the outset
- Not needed if you are just doing backup
- Changing data under an app can be a bad thing
- RSS might work
43Broader Issues
44What TWiki does to make this hard
- The files are in two places (data, pub) that you have to synchronise
- They always go together, so really they ought to be stored together
- Some random files that you don't want to synchronise
- .changes
- Thumbs.db files
- Broader problems
- Synchronising user identities
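The files that should stay out of a sync can be expressed as a small filter. The sketch below is an assumption-laden illustration rather than SyncContrib's code: it skips `.changes`, `Thumbs.db` and the `,v` revision history files that the sample configuration tells Unison to ignore. The name `should_sync` is hypothetical.

```python
import re

# Patterns for files that should not be replicated (illustrative list).
IGNORE_PATTERNS = [
    re.compile(r".*,v$"),                 # RCS history files, e.g. WebHome.txt,v
    re.compile(r"(.*/)?\.changes$"),      # TWiki's per-web .changes file
    re.compile(r"(.*/)?Thumbs\.db$"),     # Windows thumbnail cache
]


def should_sync(path: str) -> bool:
    """Return True unless the path matches one of the ignore patterns."""
    return not any(pattern.match(path) for pattern in IGNORE_PATTERNS)


candidates = ["Sandbox/WebHome.txt", "Sandbox/WebHome.txt,v",
              "Sandbox/.changes", "Sandbox/MyTopic/Thumbs.db"]
to_sync = [path for path in candidates if should_sync(path)]
```

In practice the same effect is achieved by passing ignore patterns to Unison itself; the filter only makes the intent explicit.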
45Summary: Where we are today
- Text and Attachment replication
- No conflict resolution
- But the right hook is there in TWikiMerge
- Practical applications
- Useful for backup
- Useful for publishing
- Useful for laptop work, publish off-line
- Need an ssh account on the server => open access
- Gain the rights of that user
- There is a server mode in Unison but it does not permit simultaneous connections
- Future
- Store implementations will change
- Need to sync between different types of store
- Different webs on same system could have different storage methods (1 yr )
- Tell TWiki when files are dumped under it
- Automatic attachments (TWiki will attach a file dumped in its attach directory)
46What TWiki does to make this easy
- No database
- Everything is a human-readable and human-fixable flat file
- Hierarchical Webs
47Limitations
48Limitations
- Limitations
- Doesn't handle merges well
- Can't handle more than client-server
- Merges transactions
- E.g.
- Server A: revisions made to version 1.1 make 1.2, 1.3
- Server B: no revision made
- Replicate
- 1.2 and 1.3 are seen as a single change on Server B, making version 1.2
- Server A is at 1.3. Server B is at 1.2
- Can't do conflicts
- Server A: revision made to version 1.1 makes 1.2
- Server B: revision made to version 1.1 makes 1.2 (different)
- Replicate
- Conflict files are made on both machines
- Copies no longer replicate until conflict files are dealt with
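The "merges transactions" limitation can be reproduced with a toy model. The sketch assumes that only the head text travels between servers and that the receiving side records whatever arrives as one new revision; the `Topic` class and `replicate_head` function are hypothetical and do not model TWiki's real store.

```python
class Topic:
    """A topic with TWiki-style revision numbers: 1.1, 1.2, ..."""

    def __init__(self, text: str) -> None:
        self.revisions = [text]          # revisions[0] is revision 1.1

    @property
    def head(self) -> str:
        return self.revisions[-1]

    @property
    def rev(self) -> str:
        return f"1.{len(self.revisions)}"

    def save(self, text: str) -> None:
        self.revisions.append(text)


def replicate_head(src: Topic, dst: Topic) -> None:
    """Copy only the head text, as a file synchroniser would."""
    if dst.head != src.head:
        dst.save(src.head)               # arrives as a single new revision


server_a = Topic("first draft")
server_b = Topic("first draft")
server_a.save("second draft")            # 1.2 on Server A
server_a.save("third draft")             # 1.3 on Server A
replicate_head(server_a, server_b)
print(server_a.rev, server_b.rev)
```

Both servers end up with the same text, but Server A reports 1.3 while Server B reports 1.2, so the revision numbers no longer refer to the same content.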
49What it actually transmits is the head revision
- As long as the HEAD revision has not changed on both machines, it replicates the newer
- If it detects a conflict, Unison will call a predefined Merge.
- Presently I don't have any call hooked into merge. Conflicts are just dumped side by side on the disk.
- These are not visible from a distance
- Need a halt-on-conflict switch
50Limitations
(Diagram of four topologies: Backup, Client Server, Peer to peer and Federated, drawn with clients C1 to C3 and servers S, S2 and S3.)
51Simultaneous updates leave questions
(Diagram: two copies of a Topic are compared and combined into one resulting Topic.)
In what order should the updates be placed into
the resulting topic?
52Future of SyncContrib
- Transaction log transmission
- Instead of shunting the HEAD copy of client/server back and forth, pass the set of changes made since last revision
- E.g. Server A on Revision 1.1 makes Revisions 1.2, 1.3
- Server B up to date with Server A's Revision 1.1
- Server B should get the 1.2 and 1.3 diffs from Server A
- Make config and logging web-visible
- Write to a wiki-compatible file format
- Expose as an attachment
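Transaction log transmission could look roughly like the sketch below: instead of the head text, the sender ships one diff per revision the receiver has not yet seen. This is only an illustration of the idea using Python's `difflib`; the function name `changes_since` and the in-memory history list are assumptions, and nothing like this exists in SyncContrib today.

```python
import difflib
from typing import List, Tuple


def changes_since(history: List[str], last_synced: int) -> List[Tuple[str, List[str]]]:
    """Return (revision, unified diff) for each revision after `last_synced`.

    `history[0]` is revision 1.1; `last_synced` is how many revisions the
    receiver already holds (at least 1).
    """
    if last_synced < 1:
        raise ValueError("receiver must already hold revision 1.1")
    log = []
    for i in range(last_synced, len(history)):
        diff = list(difflib.unified_diff(
            history[i - 1].splitlines(), history[i].splitlines(),
            fromfile=f"1.{i}", tofile=f"1.{i + 1}", lineterm=""))
        log.append((f"1.{i + 1}", diff))
    return log


# Server A holds 1.1, 1.2 and 1.3; Server B is up to date with 1.1.
server_a_history = ["alpha", "alpha\nbeta", "alpha\nbeta\ngamma"]
log = changes_since(server_a_history, last_synced=1)
```

Server B would apply the two diffs in order and so arrive at 1.3 with the same revision history as Server A.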
53Appendices
54How to create a shared SSH PuTTY (Windows) - SSH
(Unix) key
- Create shared SSH PuTTY (Windows) - SSH (Unix) key
- Install PuTTY and add the folder to the PATH environment variable.
- Generate OpenSSH keys with puttygen.
- I left the config at SSH2 RSA.
- Add a private key passphrase
- Save the private key in "c:\documents and settings\Martin Cleaver\PuttyPrivateKey.ppk"
- For some reason it did not ask for an extension but I did this to match the instructions below
- Start pageant
- An icon appeared in the bar next to the clock
- I right-clicked this, did add key and picked my PuttyPrivateKey.ppk
- PuTTY confirmed that it had loaded this.
- Back in puttygen, copy and paste the generated public key and append it to $HOME/.ssh/authorized_keys on the server. This will be a one-line entry as required by OpenSSH.
- chmod og-w $HOME/.ssh/authorized_keys
- To keep PuTTY's pageant running
- start it from the Start Menu with the private key as parameter
- pageant.exe "D:\somefolder\putty-key-jerry.ppk"
See http://twiki.org/cgi-bin/view/Codev/UnisonKeySetup
55- Notes
- You can get a copy of your public key at any time
- reload puttygen, pick the private key and enter your password.
- Create a launcher script
- Put this
- @plink mrjc.com -i "c:\Documents and Settings\Martin Cleaver\PuttyPrivateKey.ppk" -l mrjc -ssh unison -server unison -server -contactquietly
- into a separate file, called plink-mrjc.bat and invoke
- C:\moreprgs\unison>unison.exe c:\moreprgs\indigoperl\apache\TWiki\data\Sandbox ssh://mrjc.com/home/mrjc/cairotwiki.mrjc.com/twiki/data/Sandbox -sshcmd plink-mrjc.bat
- In my opinion this launcher script should be completely unnecessary.
56References on TWiki.org
- http://twiki.org/cgi-bin/view/Codev/DataAndCodeSeparation
57Timeline
(Timeline, 2000 to 2006, marking: Joachim's algorithm; my first comment on this topic; Sparks syncs with rsync.)
58Unison
Replicated slides
59Normal Unison setup
60Unison Profiles Example
- Unison preferences file
- batch = true
- log = true
- times = true
- prefer = newer
- servercmd = bin/unison
- rshargs = -i E:\home\.ssh\identityU
- include ignore
- root = E:\home\working
- root = ssh://swenson@heronetwork.com/working
- ----- cut here ------
- ignore = Name ,.,.xvpics,.o,.tmp,tmp,temp, .out
A fish has a memory span of 3 seconds, this
explains why they move a lot.
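A profile like the one on this slide is a plain list of `name = value` preferences, with `root` appearing twice. The sketch below reads such text into a dictionary of lists; it is a simplified, hypothetical reader (the name `parse_profile` is an assumption) and deliberately skips `include` lines and anything else without an equals sign.

```python
from typing import Dict, List


def parse_profile(text: str) -> Dict[str, List[str]]:
    """Collect `name = value` lines; repeated names keep every value in order."""
    prefs: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue                      # blank, comment, or `include` line
        name, value = line.split("=", 1)
        prefs.setdefault(name.strip(), []).append(value.strip())
    return prefs


# Hypothetical profile text, modelled on the slide.
sample = """
# Unison preferences file
batch = true
prefer = newer
include ignore
root = /home/me/working
root = ssh://user@example.org/working
"""
prefs = parse_profile(sample)
```

Keeping lists rather than single values matters because a two-way sync always has exactly two `root` entries.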
61Multi-configuration synchronizations with Unison
- Figure out and make Unison configurations for which files / directories
- Almost never change
- Change while working
- Change frequently
- Need to update when you move
- Create a non-password ssh key pair
- Adjust authorized_keys: command="bin/unison -server",no-pty,no-port-forwarding,no-X11-forwarding 1024 35 1..3
- Now make some simple scripts like (umain): unison palm; unison docs; unison working
By law, it is illegal to eat oranges while
bathing in California.
62Other Sync methods reported at TWiki.org
63RSync One way sync for master-slave TWiki
- Rsync was used to great effect at Inktomi many years back for a significant time period (>12 months of active use) to replicate data between sites, and worked very well. One limitation of the approach taken was that you could only have one master site for each web, but that was resolved by simply making the edit and attach links point at the appropriate master site for that web.
- Michael Sparks, who got the rsync version to work
64How to handle merged numbering?
- So if there's a revision history like this
- 1.1 1.2 1.3A 1.4A 1.3B 1.4B 1.5A then the author of 1.5A will get a message with the deltas of 1.2->1.5A and 1.2->1.4B, with a request to merge the changes into a new, common 1.6 revision.
- What if the 1.4A author is lazy (or ill or was run over by a bus) and ignores the request? Assume somebody has a 1.4B version on his machine and adds a change: the moment that his new 1.5B change is distributed, a conflict will arise and the author of 1.5B will get a conflict resolution request.
- http://twiki.org/cgi-bin/view/Codev/ReadWriteOfflineWiki
- -- JoachimDurchholz - 22 Nov 2000
65Other tools
66RSYNC (v 2.4.6)
- Pushes new and changed files or directories to remote machine
- Free
- Unidirectional
- To get a kind of bi-directional transfer, use the --update option (NTP!!): rsync -auvz / othermachine rsync -auzv othermachine/
- Uses rsh (can use ssh)
It's impossible to sneeze with your eyes open.
67RFS vs. RSync vs. Unison
68Reconcile
- Mitsubishi Electric Research Laboratories
- Does everything Unison does (mostly)
- Still in research
- Requires you to become a collaborator to get a
testing copy of it
"Any sufficiently advanced bureaucracy is
indistinguishable from molasses."
69More File Synchronizer Tools.
- MS Briefcase (Microsoft), xSync (VBX System), FileTiger (Science Translations Software)
File Tiger
High-Speed-Drive
70Filesystem Synchronization Tools
- rsync (v 2.4.6) http://rsync.samba.org
- unison (v 2.7.7 stable) http://www.cis.upenn.edu/~bcpierce/unison
- Cfengine (v 2.0.a14) http://www.iu.hio.no/cfengine/
- Reconcile (internal) http://www.merl.com/projects/reconcile/
- Other replication technologies
- Synchronizing FTP Files with Perl - http://www.linuxjournal.com/article/6686
- Thread discussing some alternatives: http://www.misticriver.net/boards/archive/index.php/t-1543.html, http://www.talkaboutshareware.com/group/alt.comp.freeware/messages/362086.html
- http://www.tgrmn.com/web/kb/showall.htm
A duck's quack doesn't echo, and nobody knows why.