Title: AToL Workshop Breakout
1 Social Networking and Collaborative Tools
William Barnett, Research Technologies, Indiana University
March 8, 2008
2 What is a Social Network Service?
- Wikipedia definition
- A social network service focuses on the building
and verifying of online social networks for
communities of people who share interests and
activities, or who are interested in exploring
the interests and activities of others, and which
necessitates the use of software. Most services
are primarily web based and provide a collection
of various ways for users to interact, such as
- chat,
- messaging,
- email,
- video,
- voice chat,
- file sharing,
- blogging,
- discussion groups, and so on.
3 NSF's Goals for Virtual Organizations
(Cyberinfrastructure Vision for 21st Century Discovery)
- To catalyze the development, implementation and
evolution of a functionally complete national
cyberinfrastructure that integrates both physical
and cyberinfrastructure assets and services to
support VOs.
- To promote and support the establishment of
world-class VOs that are secure, efficient,
reliable, accessible, usable, pervasive,
persistent and interoperable, and that are able
to exploit the full range of research and
education tools available at any given time.
- To support the development of common
cyberinfrastructure resources, services, and
tools enabling the effective, efficient creation
and operation of end-to-end cyberinfrastructure
systems for and across all science and
engineering fields, nationally and
internationally.
4 NSF: Building Effective Virtual Organizations
- 2008 BEVO Workshop
- http://www.ci.uchicago.edu/events/VirtOrg2008/index.php?pgmain
- Best Practices and Success Stories
- Infrastructures and Technologies
- Project Management and Organization
- Fostering Ongoing Collaboration
- NSF Virtual Organization Funding
- Cyber-Enabled Discovery and Innovation (CDI)
- Sustainable Digital Data Preservation and Access
Network Partners (DataNet)
5 Research Virtual Organizations: Phase 1
EXAMPLE: NEESGrid, cyberinfrastructure to facilitate
simultaneous multi-site earthquake engineering
experimentation/simulation (B.F. Spencer, U. of
Illinois)
- Tele-control and Tele-observation
- Electronic Notebook
- Advanced Data and Metadata
- Remote Collaboration
- Core Grid Services
- Security
- Simulation
6 Research Virtual Organizations: Phase 2
EXAMPLE: TeraGrid, high end compute, data and
visualization resources to the nation's academic
researchers (JP Navarro, U. of Chicago)
- Make science more productive through an
integrated set of very-high capability resources
(ASTA Projects)
- Bring TeraGrid capabilities to the broad science
community (Science Gateways)
- Provide a coordinated, general purpose, reliable
set of services and resources (Grid
interoperability working group)
- 9 resource providers of supercomputers, high end
storage systems, visualization hardware, very
high speed backbones
- Coordinated authorization and allocation process
(Globus)
- Science Gateways (20) of Web portals or
applications that serve specific communities or
projects
- Management model as single, integrated facility
7 Research Virtual Organizations: Next Gen
Human Cyberinfrastructure (HCI) of Web 2.0
technologies, user-defined mashups, Semantic Web
(aka e-Science in Europe)
- Self-published scientific workflows (e.g.,
MyExperiment)
- Cloud Computing (temporary virtual machines or
storage)
- Organizational Hubs (e.g., NanoHUB)
- Grid Computing Environments
- others?
- The goal is to transform science and lead to
discoveries that otherwise would not happen.
8 Community Sites
Reinforces weak social linkages; use existing or
build your own?
- LinkedIn (http://linkedin.com): professional
networking, contact management, information
resource, job seeking
- Facebook (http://www.facebook.com): personal
social utility; gossip, chit-chat, social
coordination, fun toys
- MySpace (http://www.myspace.com): personal
communication; personal self expression, gossip,
social coordination
- MyExperiment (http://myexperiment.org):
scientist-created shared workflows, research
collaboration
- Ning (http://www.ning.com): tools to create social
networks
- Club Penguin (http://clubpenguin.com): for the
next next gen
9 MyExperiment: Self-Published Workflows
A community social network, a market place, a
platform for launching workflows and a gateway to
other publishing environments
- Scientists create workflows, hence have
intellectual control
- Coordinates services and links resources
- Build once, use many times. Allows sharing,
re-using, re-purposing
- A repository of experimental workflows
Generic protein sequence analysis: performs a
homology search followed by multiple sequence
alignment and phylogenetic analysis. By M.B.
Monteiro
10 Content Sharing
Low barriers to contributed content, community
enforcement
- Flickr (http://flikr.com): post, comment, tag,
rate, and organize photos
- YouTube (http://www.youtube.com): post, comment
(blog), tag, and rate video clips
- Google Docs (http://docs.google.com): create and
share documents, including web-based word
processor and spreadsheet
- SciVee (http://www.scivee.tv): science
pubcasts, blogs, and discussions (part of PLoS)
- Scribd (http://scribd.com): library of
self-published documents
- Wikipedia (http://wikipedia.org): self-authored
encyclopedia driven by self-professed authorities
(and Colbert)
For example, see http://seek.ecoinformatics.org
11 Real Time Collaboration
Video, audio, and data conferencing
- Skype (http://skype.com): network-based voice and
video calling
- Polycom (http://www.polycom.com): high-end
multipoint videoconferencing, including HD
- Office Live (http://www.officelive.com):
Microsoft shared personal productivity
applications and files
- WebEx (http://webex.com): Internet-based
multi-point collaborations, including application
sharing
- Adobe Acrobat Connect Professional
(http://adobe.com): web-based collaboration and
document sharing, Flash-based
12 Tools
Containers for Content Management and
Productivity
- Drupal (http://drupal.org): collaborative content
management
- Joomla (http://www.joomla.org): open source
content management system for website
construction (used by NanoHUB)
- Moveable Type (http://moveabletype.org):
user-based content editing of web sites using
blogging tools
- Programmable Web (http://www.programmableweb.com):
APIs that allow users to integrate multiple
functions
- Doodle (http://www.doodle.ch): online polling
utility to coordinate meeting times
- SurveyMonkey (http://www.surveymonkey.com): online
surveys
13 Tag Sharing
User-based content characterization and grouping
(contra librarian-based metatagging)
- Del.icio.us (http://del.icio.us): a social
bookmarking web service for storing, sharing, and
discovering web bookmarks
- Digg (http://digg.com): tagging and voting for web
content
- Connotea (http://www.connotea.org): online
reference management and sharing for scientists,
researchers, and clinicians
- Citeulike (http://www.citeulike.org): online
service that organizes and shares academic papers
through tagging
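The user-driven tagging these services share can be sketched as a small inverted index. This is a toy illustration only, not how any of the listed services is implemented; the class name `TagIndex`, the user names, and the example.org URLs are all hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Toy folksonomy: any user may attach any free-form tag to any URL."""

    def __init__(self):
        self._by_tag = defaultdict(set)    # tag -> set of URLs carrying it
        self._taggers = defaultdict(set)   # (url, tag) -> set of users

    def tag(self, user, url, *tags):
        """Record that `user` applied each of `tags` to `url`."""
        for raw in tags:
            t = raw.strip().lower()        # normalize so "Phylogeny" == "phylogeny"
            self._by_tag[t].add(url)
            self._taggers[(url, t)].add(user)

    def urls_for(self, tag):
        """All URLs that at least one user has given this tag, sorted."""
        return sorted(self._by_tag.get(tag.strip().lower(), set()))

    def popularity(self, url, tag):
        """How many distinct users applied `tag` to `url`."""
        return len(self._taggers.get((url, tag.strip().lower()), set()))


# Hypothetical users and URLs, for illustration only.
index = TagIndex()
index.tag("alice", "http://example.org/paper-1", "phylogeny", "Beetles")
index.tag("bob", "http://example.org/paper-1", "phylogeny")
index.tag("bob", "http://example.org/paper-2", "beetles")
print(index.urls_for("beetles"))
print(index.popularity("http://example.org/paper-1", "phylogeny"))
```

Unlike librarian-assigned metadata, nothing here restricts the vocabulary: the grouping emerges from whatever tags users choose, and the per-tag user count gives a rough measure of agreement.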
14 Cloud Computing
Backend computational resources for research
communities
- Server virtualization makes it possible to
create temporary, task-specific resources.
- Allows personal infrastructures.
- Amazon's Elastic Compute Cloud (EC2) and Google
- Amazon's Simple Storage Service (S3), Google, and
now Microsoft's SkyDrive
- Can be configured as a data or compute back end
to community sites.
- cf. the Google-IBM Academic (NSF) Cluster
Computing Initiative at
http://www.google.com/intl/en/press/pressrel/20071008_ibm_univ.html
15 Cloud Computing
- Hadoop + EC2 + S3 = Super alternatives for
researchers (& real people too!)
- I recently discovered and have been inspired by a
real-world and non-trivial (in space and in time)
application of Hadoop (Open Source implementation
of Google's MapReduce) combined with the Amazon
Simple Storage Service (Amazon S3) and the Amazon
Elastic Compute Cloud (Amazon EC2). The project
was to convert pre-1922 New York Times
articles-as-scanned-TIFF-images into PDFs of the
articles.
- Recipe: 4 TB of data loaded to S3 (TIFF
images) + Hadoop (+ Java Advanced Imaging and
various glue) + 100 EC2 instances + 24 hours
= 11M PDFs, 1.5 TB on S3
- Unfortunately, the developer (Derek Gottfrid) did
not say how much this cost the NYT. But here is
my back-of-the-envelope calculation (using the
Amazon S3/EC2 FAQ):
EC2: $0.10 per instance-hour x 100 instances x 24 hrs = $240
S3: $0.15 per GB-month x 4500 GB x 1.5/31 months = $33
$0.10 per GB of data transferred in x 4000 GB = $400
$0.13 per GB of data transferred out x 1500 GB = $195
Total: $868
- Glenn Newton (2/27/2008) at
http://zzzoot.blogspot.com/2008/02/hadoop-ec2-s3-super-alternatives-for.html
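The back-of-the-envelope estimate quoted above can be reproduced in a few lines of Python. This is only a sketch: the rates and quantities are the ones given in the quoted post, read as US dollars, and the function and constant names are made up for illustration. Current Amazon pricing will differ.

```python
# Rates as quoted in the post (early 2008), assumed to be in US dollars.
EC2_PER_INSTANCE_HOUR = 0.10
S3_PER_GB_MONTH = 0.15
TRANSFER_IN_PER_GB = 0.10
TRANSFER_OUT_PER_GB = 0.13


def estimate_cost(instances, hours, stored_gb, storage_months, gb_in, gb_out):
    """Return an itemized cost estimate, in dollars, for one batch job."""
    items = {
        "ec2_compute": EC2_PER_INSTANCE_HOUR * instances * hours,
        "s3_storage": S3_PER_GB_MONTH * stored_gb * storage_months,
        "transfer_in": TRANSFER_IN_PER_GB * gb_in,
        "transfer_out": TRANSFER_OUT_PER_GB * gb_out,
    }
    items["total"] = sum(items.values())
    return items


# Quantities from the post: 100 instances for 24 hours, 4500 GB stored for
# 1.5 days of a 31-day month, 4000 GB uploaded, 1500 GB downloaded.
nyt = estimate_cost(
    instances=100,
    hours=24,
    stored_gb=4500,
    storage_months=1.5 / 31,
    gb_in=4000,
    gb_out=1500,
)
for name, dollars in nyt.items():
    print(f"{name:>12}: ${dollars:,.2f}")
```

Rounded to whole dollars, the line items match the quoted figures (240, 33, 400, 195) and the total comes to 868. Data transfer, not compute, is the largest share of the estimate.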
16 Organizational Hubs
Educational analysis and collaboration tools
(e.g., NanoHUB)
- A structured compendium of educational resources,
particularly simulations
- Online presentations, courses, learning modules,
podcasts, animations, teaching materials, etc.
- Infrastructure is transparent to users
- HUBZero software (Purdue) to be released in 2008;
built on Joomla, using Rappture for applications
Molecular dynamics simulation using the BioMOCA
simulation program at NanoHUB.org
17 Grid Computing Environments
Web portal interfaces and user-centric services to
the Grid
- May be moving away from high barrier resourcing
and authorization schemes such as Globus, although
there is still a focus on a limited number of
transformative or big science programs.
- Evolution would be to social networking models
for access to resources, either through community
created portals (e.g., Joomla) or the use of
commodity portals like Facebook.
- Back end computational resources may coalesce
into a national cloud structure or disassociate
into multiple, independent clouds.
18 Opportunities and Challenges of Social Networks
How do you create and sustain momentum?
19 Organizational Behaviors of Virtual Organizations
William B. Rouse (Health Care as a Complex
Adaptive System)
- Roles based on Leadership rather than Management
- Management through Incentives and Inhibitions
rather than Command and Control
- Measurement by Outcomes rather than Activities
- Focus on Agility rather than Efficiency
- Relationships based on Personal Commitment rather
than Contracts
- Organization structure is non-hierarchical
- The design is based on self-organization
20 AToL Social Networking Goals?
- Goals of AToL Digital Library of Biodiversity
Information Grant (Maddison, PI)
- Improve core scientific content of the ToL
collection
- Implement new technical features focusing on
needs of users from the education and research
communities.
- Initiate collection of content specifically aimed
at K-16 learners
- Develop and implement robust policies pertaining
to the administrative structure of the ToL
"I still have not found the details of the first
page after 1 hour rummaging through your web
site. It needs to be simplified before it will be
widely used. I am a retired researcher who has
time to search but even I had to give up because
of difficulties. I could have added to your data
base if I could have found out how to do it."
A. A. Berryman, comment on "First chapter of book of
life goes live", Nature.com, 26 Feb 2008
21 Broader AToL Collaborative Goals?
- Share sequences: "hot off the sequencers" data
(note impact of high throughput sequencers and
data management needs)
- Bridge gaps for AToL projects that do not have
Informatics components
- Communications challenges: data sharing,
day-to-day communications, sharing among
different projects
- Information Repositories
- Interoperability standards, workflows, and
architectures
- Extending results and outcomes to greater science
and educational communities
22 Broader Impact Goals
- Transformative science (lead to new discoveries)
- Democratization of science (lower barriers of
entry)
- Build for the next generation of scientists
- Expand roles as scientific authorities and
authentic science
23 Making this breakout productive
- What are the goals of and tools needed by AToL?
- Activity: Map significant activities and
organizational processes of the AToL community
onto existing offerings (e.g., a requirements
traceability matrix)
- Goal: To come out with workshop or other
proposals to advance AToL collaboration
- Thanks to the following people for help with this
presentation:
- Reed Beaman, Florida Museum of Natural History
- Geoffrey Fox, Pervasive Technology Labs, Indiana
University
- Rick McMullin, Pervasive Technology Labs, Indiana
University
- Mark Notess, Digital Library Program, Indiana
University
- Marlon Pierce, Pervasive Technology Labs, Indiana
University
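The mapping activity proposed for the breakout (community needs against existing offerings, in the style of a requirements traceability matrix) could be captured in a very small script. The sketch below is illustrative only: the needs are paraphrased from the goals slides, the offerings are tools named earlier in the talk, and the pairings between them are hypothetical placeholders rather than recommendations.

```python
# Hypothetical pairings for illustration only; a real matrix would be
# filled in by the breakout participants.
NEEDS_TO_OFFERINGS = {
    "day-to-day communications": ["Skype", "WebEx"],
    "sharing sequence data": ["Amazon S3"],
    "shared workflows": ["MyExperiment"],
    "interoperability standards": [],  # nothing mapped yet
}


def build_matrix(mapping):
    """Return (sorted offering names, {need: [True/False per offering]})."""
    offerings = sorted({name for names in mapping.values() for name in names})
    rows = {
        need: [name in names for name in offerings]
        for need, names in mapping.items()
    }
    return offerings, rows


def uncovered(mapping):
    """Needs with no candidate offering: the gaps worth discussing."""
    return [need for need, names in mapping.items() if not names]


offerings, rows = build_matrix(NEEDS_TO_OFFERINGS)
print("need".ljust(30) + " ".join(offerings))
for need, flags in rows.items():
    cells = " ".join(
        ("x" if flag else ".").center(len(name))
        for flag, name in zip(flags, offerings)
    )
    print(need.ljust(30) + cells)
print("gaps:", uncovered(NEEDS_TO_OFFERINGS))
```

Rows with no mark are the needs that no existing offering covers, which is where a workshop proposal would most plausibly focus.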