Archive for May, 2008

JA-SIG 2008 Wednesday Notes

May 6, 2008

Check JA-SIG page for presentation uploads.

Session: Functionality Mashup, Charles Severance, Univ. Michigan

Charles Severance, U. Michigan, worked at Ford Motor Company and was head of the Sakai project.

U Michigan medical school put out an RFP for a student-centered learning management system. The RFP was a very good doc. Integrating personal goals & lifelong learning w/ institutional goals & competencies. Portfolio as content mgmt system for students.

Sakai is institution focused, Moodle is teacher focused, Facebook is student focused.

The software must compete with facebook.

Michigan will make sakai work w blackboard & sharepoint

Charles started looking at personal systems. looked at Nate Angell’s blog.

Looked at Moodle portfolio, Elgg social software, Mahara (he likes Mahara).

Functionality mashup: the ability to pull one program into another program under end user control. Not just read only. Includes identity, context, roles, data interchange.

Facebook is model for mashup

www.bodington.org LMS. Check out www.dr-chuck.com/media.php?id=80

Changing a Learning Management System (lms) is too difficult for a university. Alternative: make the change continuous and under faculty control w/ plugins and mashups.

Student mashup inspired by facebook

Instructor mashup inspired by youtube

SiteMaker : www.gvcsitemaker.com

Sitemaker is a quickie site developer tool, open source. maybe only for apple?

Made sitemaker create a website for sakai.

demo : www.imsglobal.org

Sakai/IMS Google Summer of Code: working on making SharePoint work w/in Sakai. Same for Moodle. Working on Sakai presence.

Design heading toward facebook/mahara as student-centered point of view

igoogle-like interface to sakai.

moodle 2.0 roadmap. check it out.

CLOSING GENERAL SESSION, Kaye Howe, National Science Digital Library

Prometheus : the cost of stealing an intellectual tool

Odysseus : the power of intellect : the trickster vs. the wise

Faust : the power of intellect and the temptation

In ancient days only a very few people had access to knowledge. Only in Enlightenment did this start to open up. The first encyclopedia .. to open knowledge. Collaboration and dissemination, pushback from govts. In the preface to the encyclopedia they were concerned about the return of the period of darkness, they lamented that sometimes the periods of light were so short.

American Enlightenment : correspondence of John Adams and Jefferson. If there is one civil religion in America, it is education, public education.

Education is the grace of a fallen world. If we did not have to improve ourselves, we would not need education.

Wallace Stevens : poet . the poems of our climate

INTRODUCTION TO SAKAI WORKSHOP : Anthony Whyte, U. Michigan

Check presentation out here.

In Japan very interested in mobile device access. In So. Africa they do not have pcs, so need to access via phones. Sakai is mobile oriented design.

Features

  • Course mgmt
  • Ad hoc work groups
  • Research collaboration
  • Portfolios : 2.50
  • Sakaibrary : you can search w library search engines to bring citations into sakai sites. tools for publishing subject research guides. authoring interface for librarians.

Focus is on collaboration & learning environment, as opposed to course mgt system

  • whitman college: comm & share info in a collaborating learning environment support staff and faculty
  • it is a core of a number of apps

sakai will be working w fluid. accessibility is a huge priority.

  • blackboard is not accessible.

Top priorities: accessibility, ux, better faculty facing content offering course presentation interface.

Next big release in fall and once a year in fall. release is maintained for 2 academic years.

Development community comes from academic institutions.

Implementation roles: proj mgr, technical lead, sysadmin, dba, developer, qa, ui designer, instructional designer, technical writer, enduser support, trainer.

High end staffing: 14 FTE

Low end staff: 1.85 staff

Commercial affiliates rsmart & unicon

Move to push java out of front end and move ajax in, therefore easier to customize…

You can drop a sakai widget into facebook or put a sakai widget onto the mac os x desktop

U Cambridge

  • drag and drop portlets
  • have written ajax presentation widgets
  • ability to change layout, add or remove widgets. JSON format web services
  • cambridge & fluid teams working on accessibility, a difficulty w ajax
  • they have sakai running on an iphone, have a sakai pda interface
  • skinmanager.edia.nl

gradebook.

  • umichigan has more than one gradebook. gradebook svc is called from the assignments tools. at the very least you can create a gradebook in excel spreadsheets & upload it. faculty want calculated columns.

setting up learning outcomes for students. not all tools have it, but there is a “goal awareness” that is attached to activities, driven by portfolio people.

there is a wiki tool that will track the percentage of participation of each student and it is being integrated into sakai.
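A rough sketch of what "percentage of participation" could mean, assuming it is each student's share of all wiki edits. The function name and the sample author list are made up for illustration; the actual wiki tool may measure participation differently.

```python
from collections import Counter

def participation(edit_authors):
    """Return each author's share of edits as a percentage of all edits."""
    counts = Counter(edit_authors)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {author: 100.0 * n / total for author, n in counts.items()}

# Hypothetical edit log: one entry per wiki edit, naming the student who made it.
shares = participation(["ana", "ben", "ana", "cy", "ana"])
```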

move toward ti2 (?) development standards so tool developers can write to the standards so that they interact with all the lms options out there.

working on a sakai facing version of sakai.org. currently it is joomla.

also under development : news feed aggregator

go to www.planetsakai.org that aggregates sakai related blogs.

ohloh.net : listing of osource committers.

check on how broad the contributor base is, what they are contributing

a dozen schools that contribute about 25 developers

coolest sakai project : rhode island k-12 eportfolio

  • 13000 course sites feeding from an SIS (student info svc?). handles add/drops, moving students.

la guardia community college is national leader in portfolio development and are working with sakai.

Univ of dayton volunteered to do quality assurance work in order to bootstrap themselves into portfolio initiative.

U Michigan Melissa Peet (sp?) works on collaborative teams…?

  • They have a published set of process rules, loosely based on apache’s system.

Changes in the community : direction of sw development enhancements or repair based on interests of community at large. If there is a unique requirement, there are vendors who could make the change for you. Since it is all open source, if the requirement proves useful, then the code could be integrated into the upgrade path of the application.

Zemba is a commercial entity wrapped around open source. Changes decided on by votes.

Support for institutions reporting beta problems is very strong because they are on the line and everyone needs them to succeed.

Newest release is 2.5 right now.

Ohio State 50-55,000 students. 15-20,000 courses. currently running on sql server. Peoplesoft/sakai connector available.

Sakai jira bug tracking site. sakaiproject.org/jira/…. most popular environment is apache/tomcat

One institution is using sql server. mysql & oracle work well either way. Outside of US predominant database is mysql. Oracle predominant in large institutions.

Georgia tech: 20,000 students, 4.5 servers, ½ hr downtime every saturday, 24/7 operations crew for support

plagiarism tools, e.g. Turnitin, avail as a custom integration, but there is an api in trunk.

Elluminate & Wimba are working on integration.

sakai.org will have a list of tools for integration.

Note from Anthony Whyte:

On DSpace: the Universidad Politécnica de Valencia (ES) is integrating Sakai and DSpace by connecting DSpace with metalib and using the Sakaibrary to retrieve citations from metalib. You should be able to get more information from Raul Mengod Lopez (rmengod@cc.upv.es) regarding their progress.

There is a Sakaibrary demo at http://sakaibrary.umdl.umich.edu

Click on “free instance”. Instructions for logging in are available on the gateway page. I should note that this is a Sakai 2.4 version of the Citations Helper.

JA-SIG 2008 Tuesday’s notes

May 6, 2008

jasig tuesday

General Session: Moving Community Source to the Mainstream, Ira Fuchs, Mellon Foundation

Educause awarded u-portal as ground-breaking open source project.

Legal, economic, psychological impediments to open source

why doesn’t every campus use open source or participate in development?

Why is growth needed :

  • communities are organic, so growing = life
  • more usage=greater synergies, more feedback, more user based support, funders get more social returns, contributors get more ego boosts, vendors get attracted & find ways to serve communities. growth begets growth & reduces overall costs.

Obstacles to growth:

  • Legal obstacles : witness issues w blackboard(?). Community-source (c-source) is subjected to a double layer of risk assessment, e.g. Intellectual Property (IP) infringement. Fuchs refutes this: campuses are always challenging the IP edge w/ standard research. Strategies: 501c3s to hold copyright & get universities out of line of fire.
  • Need for professional support: as more institutions outsource key services like email, IT spt may be shrinking at institutions. Vendors are critical to IT @ universities. Many of the significant c-source projects were vendor-supported. e.g. vendor tech spt services.
  • Real cost of c-source ownership: we need to show real costs of implementation, there are enough 2nd and 3rd gen adopters to help identify real costs. Fuchs is confident the costs are still competitive. c-source provides institutions w greater agility.
  • Open-source (o-source) as anti-commercial movement: Some concern among top leaders of academia. However, big companies use open source sw. Academia can be just as professional as business.
  • Wealthier institutions have an obligation to help less-wealthy institutions: o-source follows this model. Not all institutional leaders share this view. We need to work as a community to show the pay-offs to large institutions
  • Perception & marketing: the perception is that o-source risks are higher, benefits lower. Clear, persistent communications needed. We must communicate value of participation and collaboration both horizontally & vertically. Software project wiki sites need to provide info to senior executives. Inadequacy of marketing materials is why o-source has not penetrated. Marketing cannot be relegated to vendors only. Zotero (a Firefox plug-in) has marketing done well. Why shouldn’t academic institutions help out with other project needs like marketing, e.g. multilingual marketing? Shouldn’t open source and community source communities value this kind of contribution?

Projects to note:

RIT-Space (avail on website)

OpenCollection:

  • any museum type, any metadata schema, multinational consortium, collections to full fledged MIS. Mellon grant. Museum of Moving Image in New York and Berkeley leading project, using SOA (Service Oriented Architecture). First design workshops.

VUE:

  • vue website: concept mapping. rich connections to external resources. network analytics, filtering tools. pathways and presentation spt.

Sophie:

  • eBook authoring. rich media, timelines, annotation, interactive conversations via sophie server. can use flash, pagemaker, quark express. created a book online, 10 pages per hour.
  • try to locate sophie : poema de la siguiriya gitana

Zotero:

  • Citation manager, bookmarks on steroids, integration w/ internet archive=permanent citations for web resources. API: plug-ins for the plug-in (e.g. Vertov). George Mason Univ. funded by Mellon. Now working on zotero server for sharing citations. Internet archive will assign permanent uri to whatever site you want. IA makes it available through own search interface for use by others. zotero vertov (sp?) plugin will enable you to cite a video and determine in and out points and add metadata.

SEASR:

  • rich media analytics for humanists and artists. laptop to grid scale. automated marshalling of resources. components, workflow, soa/web svcs.

Bamboo:

  • Chicago & Berkeley leading. Shared tech svcs for arts & humanities scholarship. Community design process. SOA. By using SOA environment, hoping for greater applicability and sustainability. Maybe connect with MIT simile, fedora project, standards based annotation systems. First round of bamboo workshops underway. projectbamboo.org.

Synergy among these projects is critical for every project’s sustainability.

To succeed, need to market this to campus leaders. benefits minus costs.

Session: If we build it, will they come? Cornell dspace , George Kozak

6 programmers, web designers. Topic of presentation : promotion of their repository

Cornell’s history:

  • Original Fedora work
  • digital preservation aDORe
  • physics arXiv

DigitalCommons@ILR

CUL Media Archive using Fedora & Ruby

Funding for deployment & maintenance of dspace thru foundation. Operational responsibility funded by grant.

Dean of Faculty’s dream: Open access for faculty. Create 193 communities. Grad Ofc: ETDs. Students offered print on demand services in exchange for voluntarily submitting.

Code enhancements by Cornell info tech grp:

  • quick submit program
  • view counter for items
  • Offered to load materials & provide metadata or tech spt
  • Other selling points: guaranteed open access, google harvesting, harvesting, guaranteed storage & web access.

Position of assoc univ librarian for scholarly communication & collections was created. AUL set up an IR team of librarians & tech staff.

Several upgrades.

  • rebranded site.
  • removed empty communities
  • the new paint attracted a lot more attention.

More requests for inputting items came in.

Physicists wanted archives dept to store their videos. They submit video into their repositories. They are starting streaming videos.

Some collections: they provide CDs of content when desired.

Archived some websites that would no longer be sustained using httptrap (?) sw.

Cornell will be using Sakai and Dspace to make content available through Sakai.

Focus on providing materials that were “losing their home” or previously unavailable on web.

Working with grad school to mandate e-submission.

Other avenues:

  • harvesting our domain
  • works seeking publisher
  • local communities

Size of repository tripled in one year. Many items added through batch loads.

Tracked increase in # of hits. 70% from robots. non-robot hits were around 130K to 150K per month

40% of downloads from robots.

George tracks hits and downloads, w and w/o bots. Tracks it with year and items.
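The with/without-bots split can be sketched in a few lines. This is only an illustration, assuming a hypothetical list of (user agent, event kind) log records and a naive substring test for robots; it is not Cornell's actual tooling, and the marker list is an assumption.

```python
from collections import Counter

# Hypothetical substrings used to flag robot user agents (assumption for illustration).
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

def is_robot(user_agent):
    """Naive check: treat any agent containing a known marker as a robot."""
    agent = user_agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)

def summarize(records):
    """Count events split by kind ("hit" or "download") and by robot vs. human."""
    counts = Counter()
    for user_agent, kind in records:
        source = "robot" if is_robot(user_agent) else "human"
        counts[(kind, source)] += 1
    return counts

def robot_share(counts, kind):
    """Fraction of events of the given kind that came from robots (0.0 if none)."""
    robots = counts[(kind, "robot")]
    total = robots + counts[(kind, "human")]
    return robots / total if total else 0.0

# Made-up sample records, not real log data.
sample = [
    ("Googlebot/2.1", "hit"),
    ("Mozilla/5.0 (Windows)", "hit"),
    ("Yahoo! Slurp", "download"),
    ("Mozilla/5.0 (Macintosh)", "download"),
]
counts = summarize(sample)
```

Calling robot_share(counts, "hit") and robot_share(counts, "download") gives the two percentages the notes mention, computed separately for hits and downloads.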

Tracked unique IPs, majority from outside univ.

Problem: How do you balance quantity vs. “scholarly”?

Library controversy about community content. Is it ephemeral or should be preserved?

Storage and network transmission costs are significant.

How do we measure success? hits? downloads? # collections?

Need for fulltime funded staff for outreach & recruitment.

They took out self-registration because people got in and wanted access to closed collections.

Kozak is working on a white paper to convince people why a repository is needed and why it needs staff.

Their new university librarian is Ann Killian (sp?)

They are getting 2 new servers.

Another Dspace project, OpenPolicy, is thinking about charging for statistics. Kozak does charge for conversion to pdf. They are working on the statistics issue. Using Google Analytics.

Embargo code is from University of Maryland and being used in ETD collection.

National Library of Medicine is highest.

dspace.org is using Google Analytics to track their stats.

Work on submission form to make a quick submit routine was not fruitful. But got feedback that licence page would be best up front.

ETDs also being sent to Proquest. Workflow will be students submit to ecommons & library forwards it to proquest.

Students like the print on demand service, library will print & distribute to committee members.

A SHORT CONVERSATION WITH MICHELE KIMPTON

Statistics is a committer priority as of OR2008. The committers agreed they will take UMinho statistics module and integrate it for future release of Dspace, manakin and (?) jsp-ui. Minho patch works with 1.4 only, not 1.5. Temporary solution for manakin: google analytics javascript code in web pages. Will not show closed collection views, tho’. Mark mentioned Google Analytics presents some processing overhead and cost issues, too.

CONVERSATION WITH LIBRARIAN FROM WOODS HOLE

Copyright issues: Sherpa can be misleading. Must read actual license agreement. Publisher may define “self-archiving” as archiving only onto the author’s pc or a storage mechanism completely under the author’s control.

To get publications into her DSpace, she programs a zip code query in major database feeds, gets automated reports on what is getting published in her neighborhood, then contacts the authors for preprints, NOT published versions. Word docs qualify as preprints. Converts to PDF before loading up.

Session: FLUID: FLEXIBLE USER INTERFACE DESIGN

Adaptive Technology Resource Center, U Toronto, world leader in accessibility and usability

Cross-project collaboration

Share UX (User Experience) resources across projects, solve common challenges, recognize recurring user interface idioms and needs

How do non technical people get involved in OSS? How to do distributed user testing.

Reusable flexible rich UI components, lightweight javascript, great interaction designs

  • UX toolkit
  • UI design patterns
  • UX walkthroughs
  • testing techniques
  • user profiles

what you need to design great user interfaces

  • components=recurring interactions
  • common activities uploading, finding, navigating thru content and tools, drag & drop
  • activities & contexts
  • UX walkthrus
  • checklists, pain points, solns, techniques

U-camps or user camps provide

  • basic UX vocabulary, techniques
  • OS distributed usability testing, competing with svcs like Morae
  • surveys, screen recording, keyboard tracking, etc. VU lab to be released soon.

UI design patterns

  • pattern=proven soln to a common problem in a specified context.
  • first open source pattern repository
  • share patterns across communities
  • www.uidesignpatterns.org

Goals:

  • make it easier for developers to build better, more accessible user interfaces.
  • support collaboration w designers
  • foster sharing of design and code
  • adaptable for variety of tools & workflows
  • diverse presentation frameworks

fluid component:

  • client side: html, style sheets, javascript, accessibility metadata
  • server: ability to respond to RESTful requests (get, post), ability to deliver appropriate markup and data

UI adaptation

  • flexible layouts & linearization: switching from multiple to one column
  • enhanced nav aids: turn on/off sitemaps, breadcrumbs
  • keyboard support: shortcuts, navigation
  • work based on jQuery

fluid components built to work with portals. support for multiple instances. dom searches constrained to fragments.

Fearless javascript workshop wiki.fluidproject.org/x/71Mk

fluidproject.org, wiki.fluidproject.org

other dhtml toolkits with accessibility: dojo. Will be incl. in jquery release.

Graceful degradation issue. Possibility of over dependence on client side javascript. They are very interested in open source renderers.

simile widget like timeline great but not accessible.

Session: OPEN SOURCE LONG TERM PRESERVATION ARCHIVES : Richard Matthews, Sun Inc.

Richard working on Honeycomb project

Sun Microsystems’ commitment to open source has strategic goals : increase core developers. This results in more partners, more awareness of trends. Also larger user community and funding support.

Solaris is open source software (oss). Commitment to port utilities to solaris. opensource.org.

Sun xVM is Sun's virtualization offering, an alternative to vmware

Sun announcing completely open platform including apache, php, ubuntu, synopsys, mysql, opensolaris & opensparc: hw & sw

www.opensolaris.org : site for new stuff

dtrace : open source dynamic tracing (debugging) tool

www.sun-pasig.org May 27 will have more info about preservation archiving. sun preservation archiving community.

Reasons: compliance, book & image sharing, national heritage content, newspapers, data, applications & systems, journals, born digital, tiered repositories.

Proposed soln: fedora front end plus sun honeycomb

sam-qfs project : best policy based multi-tiered archive manager. Application transparent dynamic data movement, 4 tiers, local & remote, continuous archive=cdp, WORM & retention mgmt

Infinite archive system: scalable multi-tiered SAM-QFS, platform base, 10-256 TB systems, data-in-place upgrade

Tier 3: tape archive; tier 2: disk cache; tier 1: disk archive

Serving Library Of Congress, petabyte of data, and Dept of Defense customers.

Sun storagetek 5800 Honeycomb

Smart, network attached, clustered, racked storage system

Metadata awareness built into design of box & data layout on disk

Open system, open source sw

RAIN architecture based on “cells” disk architecture

  • l2 load spreading switches
  • service processor
  • ea node Opteron-based SunFire server

Honeycomb

  • architecture optimized to store & retrieve unstructured fixed content. object storage, metadata aware.
  • dublin core metadata
  • web dav
  • future: xam (metadata & query model).
  • extreme data protection via RAID6. mean time to data loss > 2M yrs
  • demoed running video and pulling out disks at the same time.
  • standard java & C APIs in SDK
  • horizontal scaling
  • dublin core is only beginning
  • platform agnostic

near future: onboard local data services available (’storage beans’)

why fedora & honeycomb?

  • to address scalability need fedora as aggregator of many different repositories.
  • designed w proper intelligence in proper places; metadata integral to storage, world-class reliability, persistency and scalability; end-2-end oss, automated wide area backup option

Storage beans

  • discrete services inside Honeycomb. example apps:
  • asynchronous background jobs : transformations (take all MP3s and turn into MP7 files : remastering of your files), periodic data scrubbing, duplicate consolidation (de-duping)
  • synchronous: audit logs, watermarking, encryption

Fedora now runs on Solaris/Open Solaris

  • Server + storage reference configs
  • inclusion of fedora 3 in Open solaris as ‘Indiana’ repository
  • Fedora on Solaris
  • Johns Hopkins now using this config.

eresearch, preservation archive, publishing going into fedora commons and going to fast disk, honeycomb, tape.

Can control views of metadata, e.g. blot out privacy data in mri file. Handles embedded EXIF data.

www.sun.com/storagetek/disk_systems/enterprise/5800/index.xml

www.sun-pasig.org

storagetek/management_software/data_management/sam/index.xml

www.opensolaris.org/os/project/honeycomb

Questions/comments from audience:

issue with the tremendous number of filehandles for tiny bits of data. 5800 was designed for large data files.

issue with headroom on each node so you needed a lot more storage than advertised

JA-SIG 2008 Monday’s notes

May 6, 2008

JA-SIG monday

Check JA-SIG page for presentation uploads.

Check out the talk by DSpace’s Michele Kimpton.

Opening Session

50 percent first timers here

Keynote panel : introductions to software projects by representatives from dspace, fedora, fluid, internet2, kuali, sakai

Internet2

original idea was that commercial sector would not be able to sustain internet for academic needs

middleware initiative to foster research mission

federated login? shibboleth, campus id svcs, grouper, signet,for collaborative tools.

Fluid

cross-project collaboration. usability focused. allow users to focus on their tasks. virtual usability lab for remote testing.

Sakai (UH’s Laulima)

use cases beyond classroom spt. e-portfolio, research collaborations. sakai-ware integrates library resources. moving to community of enterprise deployment. concerns about having sw used by 10s of 1000s. reqmts from technical, usr spt, and end user communities. looking at other svcs that could be integrated (do not reinvent wheel), e.g. calendar, java document storage, etc. also workflow. looking at fluid for usability, both end user and developer.

Kuali

aimed at the space where banner, sap, peoplesoft have been. Coeus (by mit) for research administration also being extended. kuali rice : developer tools. similar to sakai, being rolled out. sustainability issues.

Note to folks : UH ITS is looking at adopting this product.

fedora commons

flexible, extensible, digital object repository architecture developed out of cornell. maybe oldest of projects. Carl Lagoze involved in 2001. may have been 1st open source project funded by Mellon, which also funded kuali & sakai. Includes oai, semantic technology integration, ingesting.

Not so focused on institutional repositories, which is dspace’s focus. Fedora Commons is about preserving, archiving, re-use, data resources. Fedora has organized as a 501c3. There are 5 open src projects under fedora commons: topaz proj, cougar project, mobera(sp?). 127 known installations of fedora. 4 yrs to come up w/ sustainability plan. data curation, open access policies, etc.

dspace

Started in 2000. Enable researchers to have broader & permanent access to their works. over 300 institutions. 25% in US. very active abroad. no centralized dev. over 100 active developers. 14 lead developers who take community code and roll it into core source. primarily used by libraries, 70% of use cases libraries. Others: corporations, research orgs. Can use it out of the box. Can ingest, search & browse, can manage workflows, features storage asset system, some preservation tools.

New executive director Kimpton’s background is non-profit, not academic. It was getting difficult to coordinate priorities & needs and how that was being filtered to development team. Copyright to software used to be held by hp & mit. Moved copyright to dspace foundation, which would be committed to keeping copyright open.

Strategic objectives: work w/ community to advance platform. Michele hired in april. Priority is to advance the tools. 1.5 released in OR2008. It doubled size of code, w/ 30 developers worldwide. Next large release being planned, with goal for more modularity where people can choose tools such as assetstore. fewer dependencies. Building partnerships & joint dev projects that can address community at large. Zotero & citation mgmt. partnership with Sun and Honeycomb assetstore system. NITLE is a non-profit working to provide repository services to smaller academic institutions.

jasig

Original mission to provide a gathering place for java users in academic spaces. Early on created commercial partnerships. Mellon funding. Central Authentication Service (CAS) created by Yale. u-portal project started. jasig formed 501c3 and looking at sustainability. Release of up3. Interested in community source mgmt. Managing user requests & sustainability.

Session: USING FEDORA TO MANAGE STUDENT DOCUMENTS

Coldfusion front end

Simple config: object and multiple datastreams

Complex config e.g.: header object : objects. dublin core at both levels.

This application: Admissions Office. New student process generates tiff pages.

Student document : RELS-EXT (relates child objects to parent object), tiffs, pdfs, dublin core. Tiffs converted to multipage pdf. People have not asked for OCRd PDFs.

Naming convention: if datastream begins with T, then it is a tiff.
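The convention is simple enough to sketch. This assumes the check is case-insensitive (the page listing later in these notes shows ids like t0004); the function names and sample ids are made up for illustration.

```python
def datastream_kind(dsid):
    """Classify a datastream id by its first letter, per the naming convention."""
    return "tiff" if dsid.upper().startswith("T") else "other"

def tiff_datastreams(dsids):
    """Keep only the ids that the convention marks as tiffs, in their given order."""
    return [dsid for dsid in dsids if datastream_kind(dsid) == "tiff"]

# Hypothetical datastream ids for one student document.
pages = tiff_datastreams(["DC", "RELS-EXT", "t0001", "t0002", "PDF"])
```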

system functions:

  • display a folder
  • view a pdf
  • create a new folder (new student record)
  • create a new document
  • add a page to an existing doc

cornell uses kerberos identity credentialing software, which needed to be integrated w/ fedora.

“adding page to an existing document” tool simply looks like a tiff upload. Cornell scanning everything to tiffs and uploading to folder.

Encountered some problems with batch uploads, esp. with big pdfs. The tiffs upload one at a time thru user interface. Pressure to perform faster. This implementation is only tiff upload. Browse to image, upload to coldfusion server. Created a new dsid (datastream id?), using getNextPID method.

Create a new document: doc types, e.g. independent study.
Steps: getNextPID, prepare fedora object, pull in FOXML template (fedora xml), make substitutions, ingest.
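The template-and-substitute steps above might look roughly like this. It is a sketch only: the talk used a Coldfusion front end, the template here is a cut-down FOXML-like fragment rather than a complete FOXML document, and get_next_pid and ingest are local stand-ins (assumptions) for the Fedora calls, storing into a plain dict instead of a repository.

```python
from itertools import count
from string import Template
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET

# Simplified FOXML-like template; a real FOXML object carries more elements.
FOXML_TEMPLATE = Template(
    '<foxml:digitalObject PID="$pid" '
    'xmlns:foxml="info:fedora/fedora-system:def/foxml#">\n'
    '  <foxml:objectProperties>\n'
    '    <foxml:property NAME="info:fedora/fedora-system:def/model#label" '
    'VALUE=$label/>\n'
    '  </foxml:objectProperties>\n'
    '</foxml:digitalObject>\n'
)

_pid_counter = count(1)

def get_next_pid(namespace="demo"):
    """Stand-in for Fedora's getNextPID: hands out namespace:N locally."""
    return f"{namespace}:{next(_pid_counter)}"

def prepare_object(doc_type, student_id):
    """Get a PID, make the template substitutions, and return (pid, xml)."""
    pid = get_next_pid()
    # quoteattr escapes the value and wraps it in quotes for use as an attribute.
    label = quoteattr(f"{doc_type} ({student_id})")
    return pid, FOXML_TEMPLATE.substitute(pid=pid, label=label)

def ingest(repository, pid, xml):
    """Stand-in for ingest: keep the XML in a dict keyed by PID."""
    repository[pid] = xml

repository = {}
pid, xml = prepare_object("independent study", "12345")  # made-up values
root = ET.fromstring(xml)  # raises ParseError if the result is not well-formed
ingest(repository, pid, xml)
```

Parsing the filled-in template before ingest is a cheap way to catch a bad substitution early.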

When creating a new document, system checks for student id and provides student info.

Includes external webservice utility to convert multiple tiffs into one multi-page pdf. Also delete page, delete document, regenerate pdf. Pdfs sometimes get corrupted. Regenerate pdf tool to allow user to regenerate pdf. This brings up some digital preservation issues. sometimes pages are missing from the pdf.

Using dc.title for student name, dc.date for date of birth. Yes, not conventional, was an expedient decision.

When user looks at document page listing, tiffs are named as coded, e.g. t0004. No indication of contents of that tiff. I guess it depends a lot on looking at the pdf, which would be the compilation of the tiffs. Maybe the numeric sequence tells acrobat the order of the tiffs…?

Session: BIRDS OF FEATHER SEMANTIC WEB

Check out simile.mit.edu

Cornell: Vivo : relational db. You can go to any person at Cornell and pull up relationships for that person. You can search on bee keeping and get search results of researchers. Includes bios and publications.

Then people wanted filtered view of this, e.g. look at just life scientists. sounds like a fresh database. They have been looking at moving to a new framework, took 2 programmers. used HP Jena framework that handles triple store.

People can login via kerberos and update their own pages. Colleges are asking for reporting mechanism, e.g. faculty research focus or what they are saying about their courses.

Ability to generate pick-list of graduate students.

Ability to generate research teams.

Another project looking at amazon-like “if you like this, then you might like…”

Exposing data on the web, maybe using rdf to include metadata in page.

Cornell global access has a cute interface… includes geographic data .

Dspace did a rudimentary crosswalk to get data and expose it via rdf to oai, using longwell. Mark Diggory is looking for a way to express the same user experience thru the ui that the machine can see. Goal to get the metadata resolvable in the browser so user can have views of that content and the user can pull in the content and use it locally. Lot of this work seems to be focused in the ORE space. Mark feels there is a vacuum: if you are putting the information out there, how do you discover it? Do you take all these complex relationships and put them into a digital object? The rdf can explode. You will need tools to determine how much content you want to expose. This is Mark’s problem space.
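To make the "crosswalk to rdf" idea concrete, here is a toy version that turns Dublin Core fields into N-Triples lines. It is an illustration under assumptions, not the DSpace code: the item URI and metadata are invented, and only plain string literals are handled.

```python
DC_NS = "http://purl.org/dc/elements/1.1/"

def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def crosswalk(item_uri, metadata):
    """Yield one N-Triples line per (Dublin Core element, value) pair."""
    for element, values in metadata.items():
        for value in values:
            yield f'<{item_uri}> <{DC_NS}{element}> "{escape_literal(value)}" .'

# Hypothetical item: element name -> list of values.
item = {"title": ["Bee keeping notes"], "creator": ["Doe, Jane", "Roe, Richard"]}
triples = list(crosswalk("http://example.org/item/1", item))
```

Even this tiny item yields three triples, which hints at why the rdf "can explode" once relationships between objects are included.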

ORE is looking at an rdf model for scholarly publishing and versioning. rdf can play a role similar to METS but is much freer.

Session: NITLE

dspace.nitle.org homepage :listing of institutions as communities.

  • can set up dedicated instance, also
  • about a few thousand $$
  • provides backup services

Most of their clients had backlog of digitized collections.

Some inter-institutional projects.

Featuring student scholarship

Librarians are looking at benchmarking student publishing.

They have a federated ldap system something like shibboleth. It will ask where you are from, check you against the campus ldap. You get group rights from campus ldap.
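The login flow described here (pick your campus, get checked against that campus's directory, inherit its groups) can be sketched with in-memory stand-ins. The campus names, users, and dict-based "directories" are all hypothetical; a real deployment would bind against each campus LDAP server and would never hold plaintext passwords.

```python
import hmac

# Hypothetical per-campus directories standing in for campus LDAP servers.
CAMPUS_DIRECTORIES = {
    "campus-a": {"alice": {"password": "s3cret", "groups": ["faculty", "library"]}},
    "campus-b": {"bob": {"password": "hunter2", "groups": ["students"]}},
}

def login(campus, username, password):
    """Check the user against their home campus and return their group rights.

    Returns None when the campus, user, or password does not match.
    """
    directory = CAMPUS_DIRECTORIES.get(campus)
    if directory is None:
        return None
    entry = directory.get(username)
    if entry is None:
        return None
    # Constant-time comparison; toy only, real systems delegate this to LDAP.
    if not hmac.compare_digest(entry["password"].encode(), password.encode()):
        return None
    return list(entry["groups"])
```

The same username can exist on two campuses without conflict, since the campus choice decides which directory is consulted.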

Each campus has administrator level rights, so there is a lot of trust.

Google analytics tracking by community & community hierarchy

Offers dspace training, repository manager focused.

Training uses multi-point video conference for training

Peer support & community, annual user community mtg in june 2008. Virtual user community meetings. foster peer-support among campuses.

dspace technology fellowships, hosting events, share results of experiences

A campus can participate in just user community or in just training services

Services in: dspace, Multipoint Interactive Videoconferencing, and Sakai and Moodle

NITLE would be interested in Sakai/Dspace integration

Community source: higher ed should own its own technology innovations; innovation should exist in a virtuous cycle with research & teaching needs and practice; aggregating resources effectively behind open source products and community is our best option.

Software as a service model, e.g. google, flickr, basecamp, etc. Most higher ed community source products do not match this model. Helpful for smaller organizations and for larger orgs exploring internal SaaS or other service models. Can it address the “flood” by creating opportunities to scale implementations?

nitle.org

Experimental site : http://markets.nitle.org : sign up and trade

Some size limits

Batchloading consulting svcs.

Sword and lni developments will look good.

nitle owns the handle system… so moving collections from nitle’s shared instance to individual dspace instances might be an issue, but so far it is not a difficult area.

Big area of interest is streaming media—tim donohue is working on this

Nitle serves non-profits non-academia

Nitle is subsidiary of Ithaca Harbor, a 501c3 that is an incubator for helping non-profits leverage IT technologies.

Service is about a year old.

Session: IMPLEMENTING OPEN SOURCE PORTFOLIO TOOLS IN SAKAI

Kenrick Chun at UH ITS might be using OSP in Sakai, according to presenter, Janice Smith.

Open Source Portfolio born at Univ. of Minnesota & recently made open source. rSmart is company working on this project.

CLE = collaborative learning environment – rSmart’s contribution to sakai

They do some sys analysis for sakai implementations of portfolio

Teaching students how to manage their virtual identity.

Has a resume wizard. Gathers structured data in xsd format. Categories such as activities, relevant courses, honors, experience, travel, urls, etc. System generates a resume view. You can select a variety of views. There are some institutions looking at using this.

Next archetype : portfolios for teaching & learning. Institutions can construct matrices to structure student learning in relation to learning outcomes.

Faculty can use evaluation tool to evaluate student’s work. Students can share contents with others via portfolio.

Portfolios for assessment & accreditation. Usually combined w/ portfolios for teaching & learning.

Rhode Island network for technology worked with RI dept. of ed. All 2008 graduates submit a graduation portfolio. teachers create matrices for learning outcomes, students submit work for each cell, teachers rate work. Everyone generates a report, students, faculty, and admin districts.

You can do work in one space and repurpose it for another space.

Client group: need sincere interest, enthusiasm, identified purpose, disciplinary standards, outcomes, and objectives, availability of champions, multi year commitment to process, available resources.

New sakai tool interface with 2.5. Portfolio templates, layouts, styles, and reports need coding.

How do you create a culture on campus that supports portfolios?

  • Every campus is a collection of subcultures, unique to them and common to the campus culture.
  • Changing any part of the cultural system requires an adjustment to the whole system.
  • Understanding culture helps motivating & persuading, developing specific strategies & plan of action.
  • Challenges: lack of faculty & student time and energy, lack of funding, lack of understanding of value of eportfolios, lack of willingness of administrators to promote, lack of tech resources, tech phobia or change phobia.

What are student beliefs and values regarding success as a student?

  • available power resources
  • good teaching.
  • peer influence & grapevine

Motivators for student: improves learning outcomes; improves thru creativity & increased control. Better positioning for career, more relevant to real world, improved student resources, enhancement of self-esteem and confidence. Receive credit and recognition for doing activities.

For faculty: belief about teaching & learning, avail power sources, faculty accomplishments rewarded by system, peer influence. Needs to solve problems for faculty, less time required for necessary tasks, promotion & tenure, awards & recognition.

Must have buy-in from faculty. Oftentimes students will push, e.g. for course content mgt system. Starts with a few early adopters, usually w/ lots of students. Beware of multiple systems. There is also the commitment to students of hosting their material for several years.

Administrator beliefs & values

www.osportfolio.org

www.sakaiproject.org

www.rsmart.com

Who owns the intellectual property in a portfolio? FERPA. Minnesota law’s interpretation is that the individual owns the data. Some institutions lay claim to the student’s work. Sometimes the Institutional Review Board (human subject research) may be involved.

JA-SIG 2008 : “Customizing DSpace using Manakin”

May 6, 2008

Workshop presented by Mark Diggory.

Check out his powerpoint at Manakin workshop

Manakin : built on a publishing framework called cocoon, which separates the layers of the process of producing and presenting a page: content generation, transformation phases, and serialization. Serialization takes SAX events and serializes them to an xml stream. Output could also be text or html, even binary content.

Goal: a more modular, extendable, tiered interface using css

Putting more metadata in the item page to support semantic web scanning

CSS & html layouts. manakin has great set of defaults. can do branding at community and collection and item levels. However right now it is done on the server by the developer.

Visualization of metadata : Texas A&M geologic atlas of the united states. They put google maps on dspace, also google timeline. google maps produces kml file that is embedded in the repository’s view.

Customizable advanced search interface

dome@MIT : image manakin repository

Tiers

  • style : create simple themes, xhtml and css
  • theme : create complex themes. xsl xhtml css
  • aspect : add new features, cocoon and java

Documentation

  • dspace manual
  • theme writing tutorial
  • dspace wiki
  • mailing lists

Apache cocoon

  • web dev framework
  • sax based xml
  • pipeline based architecture

Modular components

  • generators
  • transformers
  • serializers

Pipeline model

sitemap defines how page is “transformed”: tags are added and removed in serialized fashion
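A toy sketch of the generator → transformer → serializer idea, in Python rather than Cocoon itself. None of these function names come from Cocoon; the event tuples just stand in for SAX events, and the chained call plays the role the sitemap plays in wiring components together.

```python
from xml.sax.saxutils import escape

# Hypothetical stand-ins for Cocoon components; not Cocoon's API.

def generate(title, items):
    """Generator: emit a stream of SAX-like (kind, value) events."""
    yield ("start", "page")
    yield ("start", "title")
    yield ("text", title)
    yield ("end", "title")
    for item in items:
        yield ("start", "item")
        yield ("text", item)
        yield ("end", "item")
    yield ("end", "page")

def drop(events, name):
    """Transformer: remove an element together with its content."""
    depth = 0
    for kind, value in events:
        if kind == "start" and value == name:
            depth += 1
        elif kind == "end" and value == name:
            depth -= 1
        elif depth == 0:
            yield (kind, value)

def rename(events, old, new):
    """Transformer: rename one element, pass everything else through."""
    for kind, value in events:
        if kind in ("start", "end") and value == old:
            yield (kind, new)
        else:
            yield (kind, value)

def serialize(events):
    """Serializer: turn the event stream into an xml string."""
    out = []
    for kind, value in events:
        if kind == "start":
            out.append(f"<{value}>")
        elif kind == "end":
            out.append(f"</{value}>")
        else:
            out.append(escape(value))
    return "".join(out)

# The "sitemap": which components run, and in what order.
events = generate("Home", ["a", "b"])
events = drop(events, "title")
events = rename(events, "item", "li")
page = serialize(events)
```

Each stage only sees the event stream from the stage before it, which is why stages can be swapped or reordered without touching the others.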

3 components

  • aspects
  • themes
  • DRI schema

Aspects have a process of transformation also, following a sitemap. Content generation is phased: the aspect phases are chained to build up a dri document, which is then passed to style application, where themes are applied.

dri schema : digital repository interface schema

  • gives you abstract representation of a repository page
  • metadata elements
  • references to METS
  • DIM, DC, QDC, extendable to others
  • structural elements
  • TEI Lite

dri is about representing the logical content of page, what objects are referenced (collections, items),

<document>

structural elements based on TEI in <options>

structural elements based on tei in <body> artifact metadata (METS)

web-oriented metadata in <meta> : title, language, style sheets, etc.

</doc…>

aspects implement sets of related features.

  • applied to all pages
  • interact w repository
  • aspect chain
  • input dri – aspect process adds more content in stages
  • output dri
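A rough Python sketch of the aspect chain: each aspect takes a dri-like document in and hands a slightly fuller one out. The top-level element names (document, meta, body, options) follow the notes above; the div and item elements, the n attribute, and the function names are made up for illustration and are not Manakin code.

```python
import xml.etree.ElementTree as ET

def empty_document():
    """Start of the chain: a bare dri-like skeleton."""
    doc = ET.Element("document")
    for name in ("meta", "body", "options"):
        ET.SubElement(doc, name)
    return doc

def artifact_browser(doc):
    """Hypothetical aspect: adds browse content to the body."""
    div = ET.SubElement(doc.find("body"), "div", {"n": "browse"})
    div.text = "Browse collections"
    return doc

def eperson(doc):
    """Hypothetical aspect: adds a login entry to the options."""
    item = ET.SubElement(doc.find("options"), "item", {"n": "login"})
    item.text = "Login"
    return doc

def run_chain(doc, aspects):
    """dri in, dri out: every aspect sees what earlier aspects added."""
    for aspect in aspects:
        doc = aspect(doc)
    return doc

result = run_chain(empty_document(), [artifact_browser, eperson])
print(ET.tostring(result, encoding="unicode"))
```

Because input and output have the same format, adding a feature (say, a shopping cart) would just mean adding one more function to the list.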

core aspects:

  • artifact browser
  • eperson
  • submission
  • administrative

possibilities:

  • shopping cart
  • specialized searching
  • customized workflow

all aspects are self contained

  • java source code
  • javascript flow script
  • xml configuration

themes stylize content

  • xsl, css, images, static resources
  • Mark thinks it can be packaged in jars

content generation style application

artifact browser –> eperson –> submission –> html rendering

theme components

  • sitemap
  • heart of theme
  • config file
  • references components

xsl

converts dri to html

theme components

takes dri document, applies sitemap that applies xsl and i18n, produces an xhtml doc to which css is applied

xsl is pattern matching : you match a pattern and instruct what html code to use. e.g. span class vs. <b>
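The pattern-matching idea can be imitated in a few lines of Python. This is an analogue of xsl template matching, not xsl itself: the para and hi element names and the templates are invented for the example. The point is that the choice between span class and <b> lives in the theme's rules, not in the content.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def render(node, templates):
    """Walk the tree; for each element, apply the template matching its tag."""
    inner = escape(node.text or "")
    for child in node:
        inner += render(child, templates)
        inner += escape(child.tail or "")
    # Elements with no matching template just pass their content through.
    template = templates.get(node.tag, "{inner}")
    return template.format(inner=inner)

# Hypothetical theme A: highlight rendered as a styled span.
THEME_A = {
    "para": "<p>{inner}</p>",
    "hi": '<span class="bold">{inner}</span>',
}
# Hypothetical theme B: same content, highlight rendered as <b>.
THEME_B = dict(THEME_A, hi="<b>{inner}</b>")

source = ET.fromstring("<para>Read the <hi>manual</hi> first</para>")
html_a = render(source, THEME_A)
html_b = render(source, THEME_B)
```

Swapping the theme changes the html without touching the source document.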

you can import xsl stylesheets. dri to xhtml involves a library of xsl stylesheets.

xsl library feeds a single xsl template which is called by sitemap to manage transformation from dri to xhtml document.

a library can include

  • package instructions
  • structural.xsl
  • metadata xsls:
  • MODS-Handler.xsl
  • QDC-Handler.xsl
  • General-Handler.xsl

metadata handlers

4 display types, different versions for DIM, METS, and QDC

  • summarylist
  • summaryview
  • detailed list
  • detailed view

common modifications

  • headerfooter
  • item icon
  • item logos

header & footer

some of the themes may not be portable because of the developer’s metadata conventions.

localhost:8080/xmlui

localhost:8080/xmlui/?XML

localhost:8080/xmlui/DRI

dspace-test.mit.edu

mit has a download this item as a package utility dev’d through open courseware program.

configure tomcat to always look at webapps

dspace jspui has a bunch of help pages. manakin does not have help pages.

may have to copy static html and create links from manakin

where to put the html files? could put them next to static content folders, such as images or lib. may be theme-specific, too.

you can use aspect level to get system to look at specific place for changed i18n translations, e.g. where you are changing text for different collections with unique submission workflows.

localhost:8080/metadata/handle/xxxx/mets.xml

What is a theme : going from dri to html is theme

What is an aspect : going from dri to dri is aspect

What is dri : dri is an xml document format

JA-SIG 2008 : My notes on the workshop “Upgrading your DSpace Installation from 1.4 to 1.5”

May 6, 2008

Workshop presented by Mark Diggory.

Check out his powerpoint at DSpace 1.5 Upgrade

Mark and Scott are working really hard trying to clean 1.5. Mark’s advice is to wait until 1.5.1. Build problems being found already with 1.5.

The new system represents a code reorganization.

  • new maven build system
  • new services
  • new configurability

Separates java into functional units api, oai, jsp-ui

Reorganized web app services: oai, jsp-ui

Added new web app services: sword, lni, xml-ui

Allows for better customization (overlays) : you can keep customizations very separate from code base

lni, sword, oai, jsp-ui, xml-ui will become 3rd party endorsed. diggory looking at srw from oclc.

1. dspace 1.4: config files, web app resources, dependency libraries, java source files.

2. 1.5: src folder : ant build & maven assembly descriptors. java source split into upper level folders. jsps same and moved into webapp folders.

Dependency folders went away. Maven is a software project tool that enforces a set of conventions on project.

Dev tools like ant and make allowed too much flexibility in previous versions… creates unpredictability in build tools.

Maven

  • convention driven
  • build management
  • dependency resolution: manages third party dependencies
  • build modularization

Maven complexities

  • learning curve
  • distributed configuration
  • requires network (internet) access
  • much larger project than ant, many many sub projects and plugins
  • many more organizations working on it, with some competition. documentation on plugins requires a lot of research.

Release a project and other artifacts: the build creates dspace-api, which becomes dspace-api.jar and goes to the maven central repository; your build downloads it as a dependency. You build dspace with your custom changes, and it joins with the dspace 1.5 project pom.xml. You get a custom dspace installer via ant.

pom.xml = project object model, with info on version of postgres, details of dspace-parent, version of dspace, etc. The parent can be treated as a remote artifact; it does not have to be on the same hard drive.

New services in 1.5/dspace/modules:

  • customizable modules
  • original jsp-ui
  • lni
  • oai (oai-pmh)
  • sword
  • manakin (xml-ui)

should not have to customize ../src/ files

compile against java 5. dspace is dependent on java 5.

Each maven project

  • adheres to conventions
  • /src/main/webapp
  • src/main/resources
  • src/main/java

Each module an overlay, each replaces the original from webapp, keeps customizations out of original source. target folder, src folder
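A small Python sketch of what an overlay amounts to, assuming the simple rule "a file in the module replaces the stock file at the same path". This is a mental model only, not how maven implements it, and the file paths are illustrative.

```python
def apply_overlay(stock, overlay):
    """Return the merged webapp and the list of stock files that were replaced."""
    merged = dict(stock)
    merged.update(overlay)  # overlay wins wherever paths collide
    replaced = sorted(path for path in overlay if path in stock)
    return merged, replaced

# Illustrative paths; "stock" and "local" mark where each file came from.
stock_webapp = {
    "index.jsp": "stock",
    "layout/header-default.jsp": "stock",
    "layout/footer-default.jsp": "stock",
}
local_module = {
    "layout/header-default.jsp": "local",
    "static/help.html": "local",
}

webapp, replaced = apply_overlay(stock_webapp, local_module)
```

The stock dictionary is never modified, which mirrors the point of the overlay: customizations stay out of the original source.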

New configurability

  • stackable authentication
  • configurable browse
  • configurable submission
  • separate new module configurations
  • this process requires better documentation, people having problems with this.

MIT Libraries set up cvs, preserved the code in the repository, and deployed against test servers. Every time committers changed a file, Mark could diff it.

dspace.cfg file completely reorganized. broke into modules. Doing a merge between this and 1.4 dspace.cfg is challenging. Diggory went through 1.4, isolated all the changed parameters, then went to 1.5 and compared against each associated file in 1.5. tried to maintain as close to 1.4 as possible, “even if it was just whitespace”… Mark feels the next major release of dspace may be even more different.
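One way to isolate the changed parameters before merging by hand is a small script like the one below. It is a sketch, not Mark's procedure: it assumes simple key = value lines, ignores # comments, and does not handle backslash line continuations. The sample values are made up.

```python
def parse_cfg(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return params

def changed_params(stock_text, local_text):
    """Map each locally changed or added key to (stock value, local value)."""
    stock, local = parse_cfg(stock_text), parse_cfg(local_text)
    return {
        key: (stock.get(key), value)
        for key, value in local.items()
        if stock.get(key) != value
    }

# Hypothetical file contents, for illustration only.
stock_14 = """
# stock 1.4 settings
dspace.name = DSpace at My University
dspace.url = http://localhost:8080/dspace
"""
local_14 = """
dspace.name = Example Repository
dspace.url = http://localhost:8080/dspace
mail.server = smtp.example.edu
"""

changes = changed_params(stock_14, local_14)
for key, (old, new) in sorted(changes.items()):
    print(f"{key}: {old!r} -> {new!r}")
```

The resulting list is what you would then carry over into the 1.5 configuration one parameter at a time.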

Mark does not recommend upgrade approach in original documentation.

The authentication stack setting from 1.4 will not be used, because it is now called authenticate.AuthenticationManager instead of living under .eperson

Dspace 1.5 allows you to sort browse options

Impact

  • new prerequisite-Maven
  • code reorg
  • database schema changes
  • dspace.cfg changes

Mark planned for upgrade at least 2 months.

  • backup everything often
  • track customizations
  • map migration path
  • ask questions
  • practice, practice, practice

MIT backs up postgres and assetstores in subversion repositories nightly

  • /usr/bin/pg_dump --create --oids -U postgres -f backup.sql dspace (check pdf of powerpoint slide), with a restoration script
  • assetstore.dir, history.dir, report.dir, log.dir, search.dir, bin, config, etc, lib, handle-server, webapps
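A minimal Python wrapper around the pg_dump command from the slide, for a nightly job. The dated file name and the function names are assumptions, not MIT's script. Note that --oids was valid for the PostgreSQL releases of the time but has since been removed from newer ones.

```python
import subprocess
from datetime import date

def pg_dump_command(database="dspace", user="postgres", outfile="backup.sql"):
    """Build the pg_dump call from the slide as an argument list."""
    # --oids is rejected by newer PostgreSQL releases; drop it there.
    return ["/usr/bin/pg_dump", "--create", "--oids",
            "-U", user, "-f", outfile, database]

def dated_name(day, prefix="backup"):
    """Hypothetical naming scheme: one dump file per day."""
    return f"{prefix}-{day.isoformat()}.sql"

def run_backup(day=None):
    """Run the dump; raises CalledProcessError if pg_dump fails."""
    day = day or date.today()
    subprocess.run(pg_dump_command(outfile=dated_name(day)), check=True)

# Nothing runs on import; call run_backup() from the nightly job.
```

Building the command as a list avoids shell quoting problems if a database or file name ever contains spaces.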

Upgrade process

  • build w maven
  • install w ant
  • upgrade database
  • rebuild search/browse

Try using eclipse: works for user interface look-sees, and works w/ maven.

dspace 1.4 to 1.5 conversion

  • Copied modified jsps to dspace/modules/jspui/src/main/webapp/layout
  • Bring over dspace.cfg changes one parameter at a time. Do not follow the documentation recommendation to just copy the file over.
  • for building w/ maven it is better to be in the specific directory. so change to /dspace1.5/dspace before building

maven

  • has a dependency management module that tells you what versions of what files should be used. good.
  • maven does a build plan that can be previewed.

Need to shut down tomcat before modifying database because tomcat retains connections to the db.

Includes backup scripts for various directories, webapps, lib, etc. Diggory added config. He configures tomcat to put backup files in a specific directory.

news now is xml using dri

Error log issues :

  • The purpose of the log4j logs is error analysis. The statistics module is now scraping these logs, but Diggory feels this is not a good approach.
  • Attempts to separate user activity logs from system/development activity logs. dspace.log has too much information in it, hard to debug when you are bombarded with all the user activity. They are developing “consumer” logs to reflect user activities and separate from error reporting.
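The separation being described looks roughly like this, shown with Python's logging module rather than log4j. The logger names and message formats are invented; in-memory streams stand in for two separate log files.

```python
import io
import logging

def make_logger(name, stream):
    """Create a logger that writes only to its own stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of any shared root log
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

# Two destinations: one for debugging, one for usage statistics.
system_out, usage_out = io.StringIO(), io.StringIO()
system_log = make_logger("repo.system", system_out)
usage_log = make_logger("repo.usage", usage_out)

usage_log.info("view item=1234")
system_log.error("database connection lost")
```

With this split, a statistics job reads only the usage stream, and someone debugging is not bombarded with user activity.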

Input-forms.xml, dstat.cfg have not changed much.

Crosswalks do sword-related stuff.

Mark ran psql -U dspace -f /dspace/etc/database_schema14-15.sql dspace

8080/jspui – lets you access old interface

8080/xmlui – lets you access manakin interface

pom.xml lists out dependencies.

ItemTag.java altered to prevent non-MIT users from printing materials. Mark not too happy with this soln, but…

Try dspace community irc chat for info.

Next 1.5 release due in about a month. Don’t upgrade yet. Docs have to be improved, bugs stabilized.