Digital Workflows at the National Agricultural Library and Implications for Preservation
February 2018

Morgan	Daniels	
Postdoctoral	Fellow,	University	of	Maryland	College	Park	
morgan.g.daniels@gmail.com	
	 	
	
	
Executive	Summary	
	
This	study	was	designed	to	surface	needs	for	an	organization-wide	digital	preservation	
infrastructure	at	the	National	Agricultural	Library	by	examining	the	processes	currently	used	at	
NAL	in	routine	work	with	digital	materials.		It	used	an	observation-based	interview	method	to	
learn	directly	from	staff	members	about	their	workflows	with	digital	objects,	combining	the	
information	gathered	into	models	that	depict	their	work.		The	report	is	organized	to	follow	each	
of	the	four	major	digital	workflows,	ending	with	a	discussion	of	the	implications	of	the	study	for	
an	overarching	digital	preservation	program	at	the	library.			
	
The discussion of digital workflows begins with the process of digitizing materials from the
collection (Work Process 1). Digitization workflows and their implications for preservation are
aligned with the group doing the digitization: when digitization is done in-house by the
Digitization and Access branch, preservation copies of documents are saved to an NAL drive, but
when Internet Archive contractors complete the process, materials are stored on Internet Archive
servers and NAL does not host a preservation copy. This report strongly suggests building an
automated process into this workflow to obtain preservation copies of Internet Archive-hosted
materials for NAL. In addition, some ad hoc digitization is done throughout the library to serve
specific needs (particularly patron requests for portions of documents), and in these cases the
resulting digital documents should not be considered preservation-worthy. The second digital
workflow examined in this report is the creation of online exhibits (Work Process 2). Online
exhibits and websites created or collected by NAL should be considered candidates for robust
digital preservation, with an emphasis on the content and format of the communication. The
underlying digital materials used in exhibits should be preserved primarily through the
digitization workflow, which creates a high-quality digital copy. The third process, metadata
quality control (Work Process 3), is situated largely within NAL's Unified Repository. Issues in
this workflow are related to system response times and a need for more staff to keep up with
increasing demand. Finally, the report explores the process of curating research data (Work
Process 4), in which staff use several processes to complete similar work in several systems
housing different types of data. While these systems offer a valuable range of interactions for
data users, this report recommends designating Ag Data Commons as a single preservation
repository for all data types hosted by NAL.
	
While	this	research	suggests	improvements	to	each	of	the	workflows	discussed	in	the	report,	
there	are	several	larger	takeaways	that	affect	NAL	including,	but	also	beyond,	the	digital	
practices	discussed	here.	Almost	every	interviewee	mentioned	understaffing	as	a	debilitating	
factor	at	NAL.	A	larger	staff	would	ease	the	burden	of	current	employees,	who	in	one	group	
reported	that	they	complete	the	workload	that	was	once	done	by	a	team	of	ten,	with	a	current	
team	of	only	two	people.		Understaffing	causes	employee	burnout,	process	delays,	and	an	
inability	to	focus	on	innovation,	as	smaller	groups	of	people	attempt	to	meet	or	exceed	
productivity	levels	of	a	larger	staff	in	the	past.		
	
The	need	for	a	comprehensive	digital	preservation	program	at	NAL	is	both	a	motivating	factor	
and	a	major	finding	of	this	report.		The	sections	of	the	report	detailing	consolidated	workflow	
models	offer	specific	recommendations	for	improving	workflows,	while	the	proposed	
preservation	system	diagram	and	discussion	at	the	end	of	the	report	give	suggestions	for	
improving	the	digital	preservation	practices	of	NAL	as	a	whole.		In	brief,	these	suggestions	are	to	
update	the	Fedora	installation	used	by	the	Unified	Repository	to	take	advantage	of	the	robust	
preservation	features	offered	in	newer	versions	of	the	software	and	accommodate	a	greater	
range	of	digital	materials	and	to	join	a	digital	preservation	consortium	to	take	advantage	of	the	
greater	scale	of	preservation	infrastructure	that	such	consortia	make	possible.	
	
A third organization-wide need surfaced by this report is the need to communicate about
workflows and practices between and among units. Improved communication about practices
will help employees make connections between their work and their colleagues', benefiting
cross-unit collaboration on shared concerns. Several initiatives at NAL push the organization in
the	right	direction	toward	this	goal,	including	the	Brown	Bag	series,	where	staff	present	to	each	
other	about	their	work;	and	cross-organizational	working	groups,	such	as	one	which	recently	
met	about	metadata	types,	workflows,	and	standards	across	the	library.		These	and	similar	
efforts	should	be	supported,	as	they	contribute	greatly	to	the	culture	at	NAL.	
	
Digital	preservation	itself	provides	an	opportunity	for	cross-organizational	collaboration	because	
it affects the work of all branches of NAL. Such a collaboration is suggested in the conclusion to
this	report,	to	be	structured	as	a	cross-organizational	digital	preservation	working	group,	with	
leadership	from	ISD	and	participation	from	two	staff	members	from	each	branch.		For	each	unit,	
beginning	with	the	findings	around	current	practices	in	each	group	as	discussed	in	this	report,	
the	staff	members	from	that	unit	should	review	and	verify	the	findings	given	here,	augmenting	
them	with	their	own	findings	about	materials	that	require	preservation	and	key	workflow	steps,	
if	missing	from	these	models.		As	a	group,	this	team	would	determine	the	retention	and	
preservation	requirements	of	the	many	types	of	digital	materials,	resulting	in	an	ever-growing	
shared	register	of	digital	assets	at	NAL.		More	suggestions	for	this	collaborative	group	can	be	
found in the final section of this report, which looks at digital preservation from a library-wide
perspective.	
		
ISD	is	a	good	choice	to	spearhead	this	project,	due	to	their	current	role	in	backing	up	library	
assets.		At	present,	NAL	servers	and	virtual	machines	are	backed	up	daily	and	written	to	tape	
(which is stored offsite) weekly, with an annual back-up copy also stored offsite. The current
back-up	practices	are	a	necessary,	but	not	sufficient,	step	towards	robust	digital	preservation.		
The	gap	between	current	back-up	practices	and	a	robust	digital	preservation	plan,	and	ways	to	
bridge	it	which	surfaced	through	the	course	of	this	study,	are	discussed	throughout	the	report.	
	 	
Table of Contents
Executive Summary
Study Design
Participants
    Table 1. Interview Participants
Workflows
    Table 2. Workflows Documented and Consolidated
Work Process 1: Digitizing collection materials
    Table 3. Participants interviewed for work process 1: Digitizing collection materials
    Digitization Scenarios
    Consolidated workflow: Digitizing materials from the collection
    Analysis and Recommendations: Digitizing collection materials
    Digitization standards
Work Process 2: Creating an online exhibit
    Table 4. Participants interviewed for work process 2: Creating an online exhibit
    Consolidated workflow: Creating an online exhibit
    Analysis and Recommendations: Creating an online exhibit
Work Process 3: Metadata quality control for publications
    Table 5. Participants interviewed for work process 3: Metadata quality control for publications
    Consolidated Workflow: Metadata quality control for publications
    Analysis and Recommendations: Metadata quality control for publications
Work Process 4: Curating research data
    Table 6. Participants interviewed for work process 4: Curating research data
    Consolidated workflow: Curating research data
    Analysis and Recommendations: Curating research data
Conclusion: Recommendations and system diagram for digital preservation at the library as a whole
    Infrastructure Recommendations
    Workflow Recommendations
    Proposed Digital Preservation System Diagram for NAL
    Organizational Recommendations
Appendix 1 Project Charter
    Timeline
    Responsibilities
    About Contextual Inquiry
Appendix 2 Interview Protocol
References
 
 
 	
Study	Design	
This	study	was	conducted	using	the	Contextual	Inquiry	method,	developed	and	described	by	
Karen	Holtzblatt	and	Hugh	Beyer.		Specifically,	it	used	methods	enumerated	in	Holtzblatt,	
Wendell,	and	Wood	(2005).		Originally	created	in	the	context	of	user	experience	design,	
Contextual	Inquiry	helps	researchers	capture	and	understand	the	work	participants	do	in	a	
particular	organization	or	work	setting.		Contextual	Inquiry	interviews	typically	begin	with	a	
brief	set	of	questions	(see	appendix	2	for	the	semi-structured	interview	protocol	used	in	this	
study)	followed	by	an	in-depth	walk-through	of	the	processes	individuals	use	to	do	the	work	
under	study.		The	walk-through	requires	investigators	to	adopt	an	apprentice	role,	learning	
about	work	processes	by	observing	and	asking	questions	about	current	or	recent	concrete	
instances of the participants' work.
The	scope	of	this	study	concerned	the	creation	and	management	of	digital	objects	throughout	
the	National	Agricultural	Library,	seeking	to	discover	the	ways	in	which	staff	members	process,	
describe,	alter,	store,	and	preserve	digital	objects.		At	the	conclusion	of	each	interview,	the	
researcher	developed	a	visual	model	of	each	workflow	discussed.		By	consolidating	related	
models	into	one	overall	model	for	each	process,	the	researcher	produced	broad	scale	depictions	
of	the	work	required	to	produce	a	digital	product,	providing	an	organization-wide	perspective	
on	work	with	digital	objects.		Through	these	consolidations,	presented	and	described	in	this	
report,	NAL	staff	and	leadership	can	better	understand	how	the	work	of	many	units	contributes	
to the library's overall efforts, while easily seeing problems that arise in the process. With these
models,	the	digital	preservation	needs	of	NAL	are	more	easily	mapped	and	identified.		See	
Appendix	1,	the	Project	Charter,	for	more	information	about	the	study	methods.	
Participants	
Twenty-two NAL staff members were interviewed for this study between February and August
2017, approximately one quarter of the Library's full-time staff. Participants' length of tenure at
NAL ranged from one year to 34 years, averaging 12 years across all interviewees. Beginning
with the NAL staff list, the investigator emailed interview requests to individuals known to be
involved with the creation, curation, or other management of digital objects. This approach was
combined with a snowball sampling method using responses to question 9 of the interview
protocol: "Who else should I talk to about these issues, specifically people in your unit or people
involved in your workflow for these materials?" (See Appendix 2 for the full interview protocol.)
As	shown	in	Table	1,	below,	participants	were	drawn	from	all	divisions	of	the	National	
Agricultural	Library,	except	for	the	Office	of	the	Director,	which	does	not	have	hands-on	
involvement	with	the	workflows	under	investigation.	
	
Table	1.	Interview	Participants	
NAL	Unit	 Participants	
DPD:	Acquisitions	and	Metadata	 3	
DPD:	Digitization	and	Access	 4	
DPD:	Indexing	and	Informatics	 1	
IPD:	Digital	Library	 2	
IPD:	Information	Centers	 3	
IPD:	Customer	Services	 1	
ISD:	Applications	and	Systems	Technology	Branches	 3	
KSD	 5	
Total	 22	
	
Workflows	
There are four major digital object workflows discussed in this report, as seen in Table 2 below.
These workflows represent the bulk of digital object processing that takes place at NAL, which is
largely focused in the following areas: digitization of materials from the library's collection;
digital organization and communication of those and other materials through online exhibits;
curation of research data; and metadata clean-up and correction for resources created outside
NAL, particularly agricultural science literature produced both within and outside USDA. These
workflows emerged from discussion with staff members across NAL, through their responses to
the second interview question: "Please describe the types of digital materials you deal with in
your work at NAL (including 'born digital' materials, digital representations of analog material,
and web hosted databases)." The purpose of this question was primarily to surface the range of
digital materials and related workflows that were relevant to the interviewees' daily tasks.
	
The	majority	of	staff	members	interviewed	for	this	study	participated	in	the	four	workflows	
addressed	here,	while	several	individuals	were	involved	in	more	than	one	of	them.		The	number	
of	participants	listed	in	Table	2,	therefore,	does	not	directly	match	the	number	of	participants	in	
this	study.		The	following	sections	of	this	report	discuss	each	workflow	in	turn.	
	
Table	2.	Workflows	Documented	and	Consolidated	
Workflows	 Participants		
Digitizing	materials	from	the	collection		 8	
Creating	an	online	exhibit	 5	
Metadata	quality	control	for	publications	 6	
Curating	research	data		 5	
Total	 24*	
	 *several	interviewees	discussed	multiple	workflows	
	  
	
Work	Process	1:	Digitizing	collection	materials		
The	process	of	digitizing	materials	for	the	NAL	collection	was	a	clear	choice	for	inclusion	in	this	
study.		With	the	involvement	of	numerous	people	throughout	several	units	of	NAL	and	the	
Internet	Archive,	all	working	together	to	convert	print-based	collections	into	digital	files,	it	
presents	an	interesting	organizational	and	workflow	challenge.		Of	the	22	people	interviewed	
for	this	study,	8	were	directly	involved	in	the	process	of	digitizing	materials	from	the	collection,	
in	roles	ranging	from	selection	of	objects	for	digitization	to	managing	the	overall	digitization	
process.		Their	organizational	affiliations	within	NAL	are	represented	in	the	following	table.	
Table	3.	Participants	interviewed	for	work	process	1:	Digitizing	collection	materials	
NAL Unit                          Process                                                          Number of Interviewees
DPD: Digitization and Access      Digitizing collection materials                                  2
DPD: Digitization and Access      Managing digitization workflow (both NAL and Internet Archive)   2
IPD: Information Center           Finding materials for a digital collection                       2
IPD: Customer Service             Digitizing materials in response to a user request               1
DPD: Acquisitions and Metadata    Cataloging digital collections                                   1
Total                                                                                              8
	
Digitization	Scenarios		
Three	distinct	digitization	scenarios	emerged	from	interviews	with	staff,	corresponding	with	
different processes used at NAL. The Digitization and Access branch manages both its own
digitization process (shown in green) and the work done by the Internet Archive (in yellow),
which performs much of NAL's digitization work under contract. Digitization begun in response
to a user request (in blue in the model below) might take place entirely outside of the
Digitization and Access branch of the library, done instead by the Customer Services branch
staff member receiving the request.
Note:	Other	NAL	units	involved	in	the	model	below	are	shown	in	black	type,	while	yellow	type	
indicates	a	group	outside	NAL.		A	red	lightning	bolt	icon	indicates	a	breakdown	in	a	process.	
	
		 	
Consolidated workflow: Digitizing materials from the collection

Scenario 1: Workflow is triggered by a researcher request (blue)
- Researcher requests materials (through Reference or Special Collections).
- Decision: Are materials either broadly useful to NAL users or lengthy? If so (YES), ask digitization unit to help. If not (NO), digitize within unit.
- Move request to tracking system (RefTracker for Reference, JIRA for Special Collections).
- Find relevant materials and verify relevance with researcher.
- Digitize material on equipment in unit, email to self.
- Send digital item to researcher via: link to ARS file sharing site (Special Collections) or RefTracker attachment (Customer Services).
- Note: Process does not facilitate reuse.

Scenario 2: Workflow is triggered by the process of systematic digitization using Internet Archive
1. Survey collection to find pockets of need (at present, primarily within USDA publication and nursery and seed trade catalog collections).
2. Query catalog in MARC 856 field for presence of a URL.
3. Track survey of collection in a spreadsheet.

Scenario 3: Workflow is triggered by a request from a partner organization
- A partner organization or NAL staff member requests a digital copy of an object owned by NAL.
- Ask Special Collections: should this be digitized for NAL?
  YES: continue to the decision below.
  NO: Reject the request: another partner will fulfill it.

Decision: Should digitization be done by NAL or Internet Archive?
Consider:
A. size (if item is too large for IA scanner)
B. condition (if too fragile)

Digitization fulfilled by Digitization & Access group at NAL (green)
Set-up:
1. Create an entry in Jira, assign to staff member: track hours spent on process for an idea of labor costs.
2. Mark the request "In progress" in partner's admin tracking module (IBHL's tool, Gemini).
Scanning:
1. Assigned staff member retrieves items from stacks, leaves a shelf marker, brings items to digitization area.
2. Staff member verifies catalog record, enters MARC 016 and call number in a csv, and notes Special Collections approval.
3. Logs work in Jira as volume scanning is begun and completed.
4. [if applicable] Adds partner-required metadata for each volume.
Files:
- Software creates both TIFF and JPG2000 derivatives, moving TIFF to Special Collections Masters folder as the preservation copy.
- Access copy, JPG2000, is moved to Ready for Uploading folder, which also contains a MARC xml file for the serial record, exported from OCLC.
Upload:
- A regular Cron job (Perl script) from ISD Applications Branch uploads the materials to Internet Archive. It runs at noon and 4pm to see if any new materials need upload. Sends an email stating successful or failed.
- Note: Uploading can be a problem if a batch is too big; easier to troubleshoot with a smaller quantity.

Digitization fulfilled by Internet Archive (yellow)
Preparation: Once selected, pull all copies from shelf.
1. Sort and collate, selecting best copies (least fragile for Internet Archive to handle).
2. Remove staples, paperclips if they get in the way of scanning process.
3. Create a Jira record for each batch.
Metadata:
1. Use Perl script from ISD Applications Branch to pull metadata from catalog and format for IA in XLS.
2. If there are errors in the catalog records, make a spreadsheet detailing typos, title matching, other issues, and send monthly to the cataloging group email address.
Return: Internet Archive staff return a spreadsheet which itemizes the number of fold outs and image counts, and gives links to published images in IA.
Quality review: Quality Review procedures instruct staff to review a sample of the returned materials, looking at thumbnails and metadata.
1. Does title match? Is image good? Is disclaimer page present?
2. If there are missing pages, also check neighboring items.
3. Create a worksheet within the spreadsheet to track required changes.
4. Send xls to Internet Archive site manager for correction.
5. Once corrected, alert IA that NAL has accepted the batch.
Catalog update:
- Send spreadsheet to ISD Applications Branch on a monthly basis to add Internet Archive links back to NAL catalog in 856 field (Location: Electronic Resource).
- Update spreadsheet tracking digitization progress in the collection.
- Note: Materials are not systematically harvested from Internet Archive; NAL does not own or preserve a digital copy.
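The catalog survey step in the Internet Archive scenario (querying the MARC 856 field for the presence of a URL) can be illustrated with a short sketch. This is not NAL's actual query: it assumes catalog records have been exported as MARCXML, uses the 001 control number as the record identifier, and the function name `records_without_url` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace (MARC 21 "slim" schema).
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def records_without_url(marcxml):
    """Return control numbers (field 001) of records lacking an 856 $u URL.

    Records returned here have no electronic location recorded in the
    catalog, so they are candidates for the digitization survey.
    """
    root = ET.fromstring(marcxml)
    missing = []
    for record in root.iter("{http://www.loc.gov/MARC21/slim}record"):
        # Collect every non-empty 856 $u (electronic location URL).
        urls = [
            sub.text.strip()
            for sub in record.findall(
                "marc:datafield[@tag='856']/marc:subfield[@code='u']", MARC_NS
            )
            if sub.text and sub.text.strip()
        ]
        if not urls:
            control = record.find("marc:controlfield[@tag='001']", MARC_NS)
            missing.append(control.text if control is not None else None)
    return missing


# Hypothetical two-record export: one already linked, one not.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">100</controlfield>
    <datafield tag="856" ind1="4" ind2="1">
      <subfield code="u">https://archive.org/details/item001</subfield>
    </datafield>
  </record>
  <record>
    <controlfield tag="001">200</controlfield>
    <datafield tag="245" ind1="0" ind2="0">
      <subfield code="a">Seed catalog</subfield>
    </datafield>
  </record>
</collection>"""

print(records_without_url(SAMPLE))
```

The resulting list could feed the tracking spreadsheet described in step 3 of that scenario.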
	
		
Analysis	and	Recommendations:	Digitizing	collection	materials		
The	digitization	process	is	the	most	complex	workflow	consolidation	presented	in	this	report.		
Because	digitization	requests	may	arise	from	several	sources,	including	NAL	users,	partner	
organizations	(such	as	the	Biodiversity	Heritage	Library),	and	NAL	staff	member	requests,	
procedures	vary	somewhat	depending	on	origin	of	the	request	and	the	purpose	to	which	the	
digital	object	will	be	put.		The	diagram	above	consolidates	all	three	of	these	scenarios	into	one	
model,	using	color	to	differentiate	between	them.		Yellow	is	used	to	depict	digitization	done	by	
the	Internet	Archive	as	an	NAL	contractor,	green	depicts	the	process	as	fulfilled	by	Digitization	&	
Access	branch	members,	and	blue	shows	digitization	done	within	another	unit	of	the	library.	
Each of the digitization scenarios has implications for the preservation and reuse of digital
objects. Both the green and yellow diagrams above show that NAL uses the Internet Archive as
a preservation platform, creating an NAL-owned version of a digital document on an NAL
storage drive labeled "Masters" only when materials are digitized in-house. In the yellow,
Internet Archive-digitized scenario, the process is inadequate from the perspective of NAL's
digital preservation responsibilities. NAL staff add Internet Archive links to their catalog, but
do not systematically download the files to the NAL infrastructure. One major
recommendation of this report is the addition of a process to routinely transfer new files
created by the Internet Archive into NAL's Unified Repository. Master copies of these digital
files (derived from both workflows) should be part of a robust NAL-controlled digital
preservation infrastructure for long-term stability.
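As an illustration of what the first step of such a routine transfer might look like, the sketch below works from the spreadsheet of links that Internet Archive staff return and lists the items for which NAL does not yet hold a copy. It is a sketch under stated assumptions rather than a proposed implementation: the column name `ia_link`, the set of already-held identifiers, and the function names are all hypothetical. The `archive.org/details/<identifier>` link form and the `archive.org/metadata/<identifier>` endpoint are the Internet Archive's public conventions; the metadata record lists an item's files, which a later step could download and verify. No network calls are made here.

```python
import csv
import io
from urllib.parse import urlparse


def ia_identifier(url):
    """Return the item identifier from an archive.org 'details' URL, else None."""
    parsed = urlparse(url.strip())
    if parsed.netloc.lower() not in ("archive.org", "www.archive.org"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "details":
        return parts[1]
    return None


def items_to_harvest(spreadsheet_csv, held_identifiers, link_column="ia_link"):
    """List identifiers in the returned spreadsheet that NAL does not yet hold.

    spreadsheet_csv  -- CSV text of the spreadsheet (assumed export format)
    held_identifiers -- identifiers already stored in NAL infrastructure
    link_column      -- hypothetical name of the column holding the IA link
    """
    seen = set(held_identifiers)
    pending = []
    for row in csv.DictReader(io.StringIO(spreadsheet_csv)):
        ident = ia_identifier(row.get(link_column) or "")
        if ident and ident not in seen:
            pending.append(ident)
            seen.add(ident)  # skip duplicate rows for the same item
    return pending


def metadata_url(identifier):
    """URL of the item's metadata record, which lists its files."""
    return f"https://archive.org/metadata/{identifier}"


# Hypothetical spreadsheet content for demonstration.
SHEET = (
    "title,ia_link\n"
    "Volume A,https://archive.org/details/item001\n"
    "Volume B,https://archive.org/details/item002\n"
    "Volume C,not a link\n"
    "Volume A (dup),https://archive.org/details/item001\n"
)

for ident in items_to_harvest(SHEET, held_identifiers={"item002"}):
    print(ident, metadata_url(ident))
```

Running a comparison like this on the same monthly schedule as the existing catalog link update would keep the list of unharvested items current without adding a new manual step.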
In the blue diagram, depicting cases in which a staff member in a customer services role decides
to digitize some part of an item in response to a user request, the document is essentially lost
immediately from a preservation and reuse perspective: it has only short-term usefulness, in
answer to a request, and will not be saved in a way that facilitates future finding or reuse.
(Alternatively, if the document is deemed to have long-term usefulness, it joins the Digitization
and Access team's workflow, making it more broadly accessible.) Several staff members in the
special collections and customer service groups voiced the desire to make these materials
available for reuse. This report recommends, however, that because such materials are digitized
in an ad hoc way, without conforming to shared quality or storage standards, and often
represent only an excerpt rather than an entire publication, they should continue to be
considered single-use copies, to be retained only for the immediate future and not for the long term.
Digitization	standards		
As a source for technical standards for digitization, the Digitization and Access branch of NAL
follows the Federal Agencies Digital Guidelines Initiative (FADGI) Still Image Working Group's
publication Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster
Image Files (Rieger 2016). This document provides recommended standards for digitizing still
image materials, including factors like master file formats, resolution, and bit depth. For each
material type, the FADGI guidelines give recommendations using a one-to-four star rating
system, where one star represents low quality images and four stars represent best practices.
"Three star imaging defines a very good professional image capable of serving for almost all
uses. Four star defines the best imaging practical today. Images created to a four star level
represent the state of the art in image capture and are suitable for almost any use" (Rieger
2016, p. 9). By aligning imaging standards with FADGI guidelines for three and four star ratings,
the	NAL	Digitization	and	Access	branch	assures	that	the	images	created	in	the	digitization	
process	will	be	high	quality	and	adaptable	to	a	myriad	of	end-user	needs.			These	standards	
should	continue	to	be	monitored	and	adopted	by	NAL,	and	adhered	to	by	digitization	
contractors.		By	assuring	that	FADGI	digitization	standards	are	followed,	NAL	brings	itself	into	
line	with	standards	followed	by	other	federal	agencies	while	creating	the	highest	quality	digital	
surrogates	possible,	appropriate	for	long	term	preservation.		
	
	 	
Work	Process	2:	Creating	an	online	exhibit	
The	second	workflow	presented	in	this	study	is	the	creation	of	online	exhibits	using	the	Omeka	
web	publishing	platform.		Omeka	is	used	largely	by	two	groups	within	NAL:	the	Information	
Products	Division	(IPD)	and	the	Special	Collections	unit.	Of	the	22	people	interviewed	for	this	
study,	5	were	directly	involved	in	the	process	of	creating	Omeka	exhibits	for	the	library.		Their	
organizational	affiliations	within	NAL	are	represented	in	the	following	table.	
Table	4.	Participants	interviewed	for	work	process	2:	Creating	an	online	exhibit	
NAL Unit                        Process                                        Number of Interviewees
DPD: Digitization and Access    Creating an online exhibit                     1
IPD: Information Centers        Processing materials for a digital collection  3
IPD: Digital Library            Creating an online exhibit                     1
Total                                                                          5
	
In	the	following	diagram,	the	color	blue	represents	actions	taken	primarily	by	members	of	the	
Information	Products	Division	or	the	Special	Collections	group,	as	they	seek	to	create	a	new	
online	exhibit.		Green	is	used	to	depict	the	work	done	by	the	Digitization	and	Access	group	to	
digitize	materials	for	a	web	exhibit.		As	in	the	previous	workflow	diagram,	other	NAL	units	
involved	in	the	model	below	are	shown	in	black	type,	while	yellow	type	indicates	a	group	
outside	NAL.		Red	lightning	bolt	icons	indicate	breakdowns	in	a	process.	
  
	
Consolidated workflow: Creating an online exhibit

Information Products Division or Special Collections (blue)
1. Request a new Omeka instance from IPD and identify items to include.
2. Look for a digital copy of each item from sources, in order of preference:
   1. NAL Unified Repository
   2. Internet Archive NAL collection
   3. S drive
   4. Fedsys (GPO)
   Note: Exhibit will use a link to one of these sources, if available, not resulting in a preservation copy owned by NAL.
3. If no digital copy, look for a hard copy:
   1. In stacks
   2. Ask colleagues
4. In tracking spreadsheet:
   1. Add MARC 016 identifier if it already has a Voyager record
   2. Add PDF file name and location if an electronic copy exists at NAL
   3. Add IA identifier if an electronic copy exists in Internet Archive
   4. Record source of publication where found
5. Send to Digitization and Access branch.

Digitization and Access (green)
1. Adds columns for:
   1. Digitization flag number
   2. Next Steps: "Harvest from IA"; "Tag for (Info Center)"; "Add collection name to catalog"; "Upload to UR, MODS to MARC transformation" for items found in FedSys
2. Digitizes item (where needed).
3. Forwards the spreadsheet to ISD to add persistent identifier (agid).
4. Digitization team returns an Excel spreadsheet with links to digitized versions of publications in Internet Archive.

Information Products Division or Special Collections (blue)
1. Retrieve each item via the Internet Archive link to download selected images to desktop.
2. Open Omeka to add digital item/s, supply Dublin Core metadata, copy and upload an excerpt from the text, where appropriate.
3. Use Omeka plug in for exhibits to link and organize items.
4. Create a timeline using timeline.js if appropriate to the exhibit.
5. Send Word version of the exhibit to the USDA Office of Communication for editing/approval.
6. Once approved, IPD runs a 508 compliance check using Total Validator Pro.
7. Test the site and when ready, ask ISD to move it to the Production server, where it is live and publicly available.
   Note: No robust web preservation practices are currently in place. While servers are backed up by ISD, web preservation is not yet done at NAL.
Analysis	and	Recommendations:	Creating	an	online	exhibit	
 
In	the	course	of	interviews	with	NAL	staff,	it	emerged	that	IPD	and	Special	Collections	currently	
use	different	Omeka	templates	for	their	sites,	resulting	in	exhibits	with	two	distinctive	looks.		
Examples	of	current	sites	created	by	the	two	groups	follow:	
	
	
How	Did	We	Can?	The	Evolution	of	Home	Canning	Practices	
https://www.nal.usda.gov/exhibits/ipd/canning/	
	
	
An	Illustrated	Expedition	of	North	America	
https://www.nal.usda.gov/exhibits/speccoll/exhibits/show/an-illustrated-expedition		
	
In practice, all Special Collections Omeka materials are grouped together as "exhibits," in "one
big bucket" as one staff member described it, while IPD exhibits are each stand-alone Omeka
instances. As a consequence, IPD exhibits can each have a unique look and feel, which is
appreciated by IPD staff, who feel that this adds visual interest to the exhibits. The Special
Collections group's approach puts a greater emphasis on standardization, representing each
exhibit as part of a Special Collections Exhibits product.
	
Having	multiple	Omeka	instances	has	greater	implications	for	upkeep	and	maintenance	than	it	
does	for	digital	preservation.		ISD	must	be	asked	to	update	multiple	sites	when	a	new	version	of	
Omeka	is	adopted	by	NAL,	which	may	be	more	difficult	the	more	variation	is	used	in	the	
exhibits.		This	workflow	and	standardization	question	should	be	addressed	through	a	
conversation	between	ISD,	IPD,	and	Special	Collections	stakeholders,	who	may	decide	that	the	
value	to	NAL	of	flexibility	of	individual	Omeka	instances	outweighs	the	difficulty	of	updating	
multiple	instances.	
	
From	a	digital	preservation	standpoint,	however,	this	question	does	not	loom	large.		Web	
exhibits	pose	two	kinds	of	preservation	challenges:	the	preservation	of	the	underlying	content	
used	in	an	exhibit	and	the	preservation	of	its	packaging	as	an	online	exhibit.		The	workflows	
used	by	NAL	staff	to	create	web	exhibits	suggest	an	emphasis	on	the	latter	type	of	preservation	
rather than the former. Since exhibit creators use excerpts of materials found elsewhere at NAL
or on the open web, the goal in preserving web exhibits should be to preserve the content of
NAL communication with the public, not the underlying digital objects, which should
be maintained through preservation of the systems on which they reside as complete
documents (i.e., the Unified Repository).
	
From a digital preservation standpoint, current NAL back-up measures protect a copy of digital
exhibits, but a more comprehensive approach to web preservation is needed to capture and
package content for long-term preservation. The scope of NAL websites to be preserved is
broader than web exhibits, including information centers, data products, the library's home
page, and other NAL-provided content (along with other web materials NAL may choose to
collect). The emphasis on Omeka-based exhibits in this report is useful, however, because it
illustrates the variation in use of a single tool in departments throughout the library.
	
NAL should consider two products from the Internet Archive for preserving web content.
Archive-It is a hosted service widely used in research libraries, while Heritrix is an open-source
web crawler that would require more hands-on work by NAL staff to implement. While
some might consider the open-source option "free," the cost in time to already taxed ISD staff is
in no way negligible.
	
	
	
	
	 	
Work	Process	3:	Metadata	quality	control	for	publications	
Metadata	quality	control	at	NAL	takes	place	in	many	different	contexts,	with	numerous	types	of	
objects.		Metadata	quality	control	for	publications,	specifically,	is	a	key	process	involving	the	
work	of	several	people	invested	heavily	in	converting	metadata	from	publisher	feeds	and	author	
provided	sources	to	formats	compatible	with	NAL	systems.		Of	the	22	people	interviewed	for	
this	study,	6	were	directly	involved	in	ongoing	metadata	cleanup.	Their	organizational	affiliations	
within	NAL	are	represented	in	the	following	table.	
Table 5. Participants interviewed for work process 3: Metadata quality control for publications
NAL Unit | Process | Number of Interviewees
DPD: Indexing and Informatics | Subject indexing of journal articles | 1
IPD: Digital Library | Manage review process for USDA author submitted articles | 1
ISD | Correct conversion issues in publisher-provided metadata in MARC to MODS transformation | 1
ISD: Systems Technology | Create article information for PubAg | 1
DPD: Acquisitions and metadata | Metadata quality control | 2
Total | | 6
Note: Two primary workflows emerged in the process of metadata quality control, which are
both depicted in the following model. The two workflows are characterized by the metadata
source: in green, articles submitted by ARS scientists for inclusion in the NAL repository, and in
blue, metadata feeds received from publishers. The final steps, abstracting and indexing and
publication (shown in dark blue), apply to both workflows. Red lightning bolt icons indicate
breakdowns in a process.
Consolidated	Workflow:	Metadata	quality	control	for	publications	
	 	
ARS scientist submissions (green)

ARS scientist submits a publication to the Fedora repository via web form
Metadata correction team (3 people or fewer) has 30 days to make it available online
Breakdown: Workload is too great for the team; more staffing is required
Within Unified Repository, QC manager assigns Quality Control in batches of 20 papers to each reviewer, flags problem fields. Two people do quality control on each record
Reviewer opens a submission
1. Look first at "Note" field; it may say No DOI, or specify other problems with the metadata record
2. Use either "Edit MODS" or "Manage" tab to edit metadata
• Manage tab is useful if preferred workflow is to put the MODS file in Oxygen, fix it and paste it back
• Edit MODS option walks the reviewer through the fields individually
Breakdown: System is slow to expand fields for QC process. One person's workaround:
1. Use a SOLR query to search ID #s for assigned tasks
2. Output metadata to tab delimited format
3. Use LibreOffice to edit, track changes and note issues

Publisher metadata feeds (blue)

New metadata received from publishers (about 15 publishers and aggregators)
Program normalizes metadata into MODS using XSLT transformation stylesheets written by metadata librarians (have to be custom written and rewritten when publishers change how they do metadata)
Breakdown: Stylesheet editing requires considerable technical proficiency
Has publisher changed style sheet? (Yes/No)
If yes:
1. Log in to publisher's web form, click on Available batches, download
2. Unzip one aip file, open in Oxygen
3. Have Oxygen read from XSL to MODS output
4. Open output file (saved to C drive) in Oxygen to verify that it validates
5. Put stylesheet on test server to test for errors
Program creates object for the repository which includes source and MODS metadata; it goes into the repository marked as "waiting"
System processes the publisher article files submitted via FTP, converting MODS to MARC for the ILS, tags items for NALDC and PubAg
(Continue to blue box on next page)
	
 
	
	 	
 
Did author submit an article DOI?
If YES, all metadata should be present and just requires verification
1. If NO DOI provided, look up ISSN and DOI using journal and article names
2. Can copy metadata from a reliable source such as PubMed
3. Can search ISSN Register for journal, but it is a time intensive option; instead perform web search for journal name to see if an ISSN can be located on the web
4. If there is no DOI or ISSN, flag this item for later review (new journals and title acquisitions can hold up ISSN and DOI assignment; they may be assigned by publisher later)

Check other metadata against the article pdf
Review other assigned items, removing local notes once content is approved

Use automatic indexing tool to pull title, abstract, journal title in a custom format, creating subject terms and corpus for each item
Approved items go to Abstracting and Indexing team, who will spot review (and assign indexing terms if there were fewer than 3 assigned automatically) using Annotation Workbench tool, mark as "issued" when deemed ok
NALDC and PubAg query the UR for new materials since the last pull, making them live and accessible
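
Part of the DOI and ISSN checking in the workflow above could, in principle, be pre-screened automatically before a reviewer opens a record. The following is a minimal sketch in Python, not a description of any tool in use at NAL. The record layout (a dictionary with "doi" and "issn" keys) and the function names are hypothetical. The ISSN check uses the standard check digit rule (weights 8 through 2, modulus 11), and the DOI pattern only tests whether a string looks like a modern DOI, not whether it resolves.

```python
import re

# Loose pattern for modern DOIs; an assumption about what "looks like a DOI".
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
ISSN_PATTERN = re.compile(r"^[0-9]{7}[0-9X]$")


def is_valid_issn(issn: str) -> bool:
    """Return True when the ISSN's final check digit is consistent."""
    chars = issn.replace("-", "").upper()
    if not ISSN_PATTERN.match(chars):
        return False
    # Weight the first seven digits 8, 7, ..., 2 and sum them.
    total = sum(int(d) * w for d, w in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected


def identifier_problems(record: dict) -> list:
    """List identifier issues a reviewer might want flagged (hypothetical)."""
    problems = []
    doi = (record.get("doi") or "").strip()
    issn = (record.get("issn") or "").strip()
    if not doi:
        problems.append("missing DOI")
    elif not DOI_PATTERN.match(doi):
        problems.append("malformed DOI")
    if not issn:
        problems.append("missing ISSN")
    elif not is_valid_issn(issn):
        problems.append("invalid ISSN check digit")
    return problems
```

A record that returns an empty list would still need the human verification described in the workflow; the sketch only narrows down which records are likely to need identifier lookups.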
Analysis	and	Recommendations:	Metadata	quality	control	for	publications	
	
The	processes	used	at	NAL	for	metadata	quality	control	are	largely	mechanized,	and	interviews	
surfaced	few	complaints	with	the	processes	themselves.		Where	issues	did	arise	for	staff	
members,	they	related	to	slow	response	times	in	the	Unified	Repository,	time	intensive	aspects	
of	the	work	(such	as	searching	the	ISSN	Register),	and	understaffing	issues	that	made	the	work	
more	difficult.		While	systems	development	at	NAL	has	served	metadata	quality	control	
processes	well,	there	is	clearly	a	need	for	a	larger	team	of	individuals	to	work	on	this	process.			
Some	aspects	of	metadata	quality	control	(particularly	working	with	publisher	stylesheets	and	
metadata	transformation)	require	specialized	technical	proficiency,	reducing	the	available	pool	
of	NAL	staffers	who	might	currently	be	able	to	join	this	team.		However,	reskilling	and	
professional	development	of	current	metadata	staff	would	help	build	more	capacity	for	this	
group.		In	addition,	hiring	new	staff	who	can	jump	into	the	metadata	correction	process	should	
be	a	priority	for	the	Library.	
	
When staffers found problems with the workflows in place, they developed their own
workarounds. One example is an individual's use of SOLR to query the UR and perform work in
LibreOffice, to avoid slow reaction times in the QC nodes of the Unified Repository. This
workaround signals room for improvement in the UR infrastructure: quicker response time
would make the process simpler for users. On the other hand, only one of the six staff
members working with metadata quality control mentioned slow system times as a problem,
suggesting that the issue may not be widely felt.
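
The SOLR workaround described above (query by ID, export to tab delimited text, edit in LibreOffice) could be captured in a small shared script so that it does not depend on one person's practice. The following Python sketch is an illustration only. The base URL, the ID field name "PID", and the field names in the example are hypothetical, since the report does not document the UR's Solr schema. The query parameters q, fl, wt, and rows are standard Solr request parameters.

```python
import csv
import io
from urllib.parse import urlencode


def build_solr_url(base_url, ids, fields, id_field="PID"):
    """Build a Solr select URL that fetches the given records by ID."""
    if not ids:
        raise ValueError("at least one ID is required")
    # Quote each ID so characters such as ':' are treated literally.
    id_clause = " OR ".join('"{}"'.format(i) for i in ids)
    params = {
        "q": "{}:({})".format(id_field, id_clause),
        "fl": ",".join(fields),
        "wt": "json",
        "rows": len(ids),
    }
    return "{}/select?{}".format(base_url, urlencode(params))


def docs_to_tsv(docs, fields):
    """Convert Solr result documents (dicts) to tab delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)
    for doc in docs:
        row = []
        for name in fields:
            value = doc.get(name, "")
            if isinstance(value, list):
                # Multi-valued Solr fields are joined into one cell.
                value = "; ".join(str(v) for v in value)
            # Strip tabs and newlines so each record stays on one line.
            row.append(str(value).replace("\t", " ").replace("\n", " "))
        writer.writerow(row)
    return buffer.getvalue()
```

The text returned by docs_to_tsv can be saved with a .tsv extension and opened in LibreOffice Calc, where changes can be tracked as in the workaround.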
		
Because the metadata correction workflow is primarily mediated through the Unified Repository,
the	UR	is	a	major	target	for	digital	preservation	of	this	content.		With	robust	preservation	of	the	
UR	in	place,	these	materials	should	be	secured	for	long-term	access	and	use.		Steps	to	increase	
the	preservation	capabilities	of	the	UR	are	discussed	at	the	conclusion	to	this	report.	
	 	
Work	Process	4:	Curating	research	data	
At	the	National	Agricultural	Library,	research	data	curation	takes	place	within	the	Knowledge	
Services	Division,	but	work	processes	vary	greatly	in	that	division	depending	on	the	systems	
used	for	each	product,	as	the	model	below	illustrates.		Of	the	22	people	interviewed	for	this	
study,	5	were	directly	involved	in	research	data	curation.	Their	organizational	affiliations	within	
NAL	(all	within	the	Knowledge	Services	Division)	are	represented	in	the	following	table.	
Table 6. Participants interviewed for work process 4: Curating research data
NAL Unit | Process | Number of Interviewees
KSD: Scientific Data Management | Curating research data | 2
KSD | Curating research data | 3
Total | | 5
Because	of	the	variation	in	work	taking	place	between	the	three	data	systems	studied	for	this	
report	(Ag	Data	Commons,	Lifecycle	Assessment	Commons,	and	i5k)	the	three	systems	are	
depicted	independently	in	this	model.		Processes	can	be	read	in	parallel,	with	a	summary	of	
each	step	in	a	blue	figure	at	the	top	of	each	page.	
Note:	Other	NAL	units	involved	in	the	model	below	are	shown	in	black	type,	while	yellow	type	
indicates	a	group	outside	NAL.		Red	lightning	bolt	icons	indicate	breakdowns	in	a	process.	
	
	 	
Consolidated	workflow:	Curating	research	data	
	
Researcher submits data/model/genome, which generates notification for curator

Open and view files locally

Breakdown: Each data product is in a separate system, requiring upkeep of multiple tools

Ag Data Commons (ADC)
1. Log into Ag Data Commons Workbench (back end interface)
2. Review record for any apparent errors

Life Cycle Assessment Commons (LCA)
Check fidelity of model:
1. Create a new database, import files to see that they open properly, close database
2. Open SQL client, run SQL scripts that check models run by specified rules
3. Read and verify documentation

i5k
1. View submitted files and information (Submission to NCBI is a prerequisite, allowing i5k to benefit from the NCBI ingest process)
2. Run md5sum checksum to verify that transfer happens correctly (there are often issues)

Breakdown: Checksum is routinely run only in i5k; the other systems should also run similar checks
	
	 	
Edit files

ADC
Go through checklist of actions (on GitHub wiki) to review and improve contributor supplied metadata

LCA
1. Track all issues in spreadsheet template, use as a change log
2. Make minor changes to format of names and labels. Major changes are recommended in worksheet but not changed without researcher approval
3. Go back to researcher with remaining issues/recommendations
4. Invested researchers will make their own changes; others overwhelmed with the task will ask KSD to make changes

i5k
Data files checked programmatically in Apollo system using scripts created by a postdoc at National Taiwan University. "After a correction period, QC reports are re-generated until no errors remain."

Mint DOI
1. If data are uploaded directly to ADC, get a DOI from the DOE Interagency Web Service, download resulting XML
2. In Oxygen, open XML and remove unneeded fields, save file to GitHub, upload to DOE site. Returns XML with a DOI
3. Paste DOI in Workbench, save in supervisor review mode to check that it resolves correctly

i5k: Is a DOI needed? (Yes/No) If so, data must also be submitted to ADC
	
Review

ADC
1. Email list of changes to author for approval and ask any questions
2. Authors resubmit data for review. If they change anything, they email with a list of changes rather than using a modification note, but curators worry this method won't scale up when data submissions increase

LCA
1. Independent peer reviewer with subject expertise reviews documentation and representation of model, making comments (reviewers are network partners, some academic, some government)
2. Curator reviews comments from reviewers, but will only relay comments related to documentation back to the researcher
3. Researcher revises if necessary (may ask KSD to help)

i5k
Email discussion with depositor about any unclear metadata or other issues
	
Publish

ADC
Once approved by submitter, publish to Ag Data Commons

LCA
1. Publish model to production site to test for data loss/data collision
2. Move model from production site to live site
3. Backup on GitHub (saving file as received, file as published, change log, and any other documentation)

i5k
1. Use Tripal pages, software that incorporates biodata schema and Drupal, to set up the database connecting different pieces of data
2. Use application interface to index the database in Blast and HMMER systems: Add new organism, specify type, search for file on server to upload, give description (based on preferred format)
3. Server indexes database, which shows up on Blast page for querying
Analysis	and	Recommendations:	Curating	research	data	
	
The	most	striking	element	of	this	workflow	model	is	the	variation	in	processes	between	systems	
for	working	with	data.		Differences	in	funding,	partnership	models,	and	project	histories	
account	for	many	of	the	differences	in	systems	and	workflows.		The	i5k	project,	for	example,	is	
a	community	effort	of	arthropod	researchers	for	whom	a	shared	database	of	sequenced	
genomes	of	these	animal	species	is	intended	to	improve	research	outcomes	(Poelchau	2015).		
In contrast, the Life Cycle Assessment Commons supports assessment of "environmental
impacts associated with all stages of a product's life. [...] The goal of LCA is to compare the full
range of environmental effects assignable to products and services by quantifying all inputs and
outputs of material flows, and then assessing how these material flows impact the
environment" (Life Cycle Assessment Commons). In addition to raw data related to the
assessment of numerous agricultural products, the LCA Commons contains the models used by
researchers to assess the impact of actions throughout an agricultural product's lifecycle.
Between	i5k	and	LCA	alone,	there	are	a	host	of	differences	in	data	structures,	infrastructure	
requirements,	and	quality	control	methods.		Ag	Data	Commons	(ADC	in	the	model	above)	
contains	a	wider	range	of	data	types,	seeking	to	serve	as	a	catch-all	data	repository	for	
agricultural	researchers.		What	ADC	gains	in	scope,	it	loses	in	the	ability	to	standardize,	which	is	
an important consideration for the long-term usability of datasets. While omitted from the
workflow model due to space constraints, the Long-Term Agroecosystem Research (LTAR)
network is a fourth data product provided by KSD, offering historical data from 18 research
sites with an average of 50 years of data. As a repository for data over a long period, it is an
extremely valuable resource. Because it focuses on observational data, geographic location is
of central importance to LTAR, whereas location is less significant for the other data systems
managed by KSD.
	
While	separate	systems	for	groups	of	research	data	provide	unique	forms	of	access	to	the	data	
they	contain,	this	does	create	some	redundant	work	in	system	upkeep.		To	the	extent	possible,	
NAL	should	attempt	to	bring	the	systems	together	and	develop	features	that	can	be	accessed	
across	platforms.		For	example,	one	major	goal	of	KSD	is	to	connect	research	data	with	the	
publications derived from that data. In Ag Data Commons, the metadata field "Primary Article"
can accept a citation, DOI, and AgID, facilitating linkages between ADC, other NAL products, and
resources found elsewhere on the internet. One possible workflow change would involve
making an ADC record for each new dataset, which KSD might then treat as NAL's preservation
copy of the data and metadata. While the other data products would continue to offer
enhanced	functionality	for	using	the	data,	ADC	would	serve	as	the	umbrella	repository	for	
preservation	purposes.	
	
In	the	first	stage	of	the	model,	receipt	of	files,	curators	for	each	of	the	three	systems	open	and	
view the materials they have received, checking for issues in the integrity of the files. While i5k
and LCA have well-defined error checking methods (checksums and script running), ADC uses a
more general review method, reflecting the broad range of data types the system accepts.
Adding	automated	checksums	to	the	submission	process	for	all	NAL	data	systems	would	help	
ensure	that	file	transfer	happens	correctly,	and	increase	the	trustworthiness	of	these	systems.		
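
A transfer check of the kind i5k runs with md5sum can be reproduced in a few lines and attached to any submission process. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and it assumes the depositor supplies the expected MD5 value alongside each file, which the report does not confirm for ADC or LCA.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute a file's MD5 digest, reading in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(path, expected_md5):
    """Return True when the received file matches the depositor's checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

MD5 is used here only because it matches the md5sum tool named in the workflow; it detects accidental corruption in transfer but is not suitable as protection against deliberate tampering.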
One	notable	shared	feature	of	the	three	workflows	is	the	documentation	of	the	editing	process.		
Whether	done	programmatically,	by	following	a	checklist,	or	with	changes	documented	in	a	
spreadsheet,	KSD	ensures	the	integrity	of	the	data	in	part	through	keeping	a	record	of	
transformations	made.		In	the	review	section	of	the	three	workflows,	only	LCA	uses	peer	review	
to	ensure	the	quality	of	the	submission.		The	other	tools	use	back	and	forth	communication	
with	depositors	to	come	to	an	agreement	about	changes	that	need	to	be	made.		This	is	a	labor	
intensive	method,	as	staff	members	working	on	ADC	noted,	and	they	are	concerned	that	it	may	
not	scale	up	as	submission	volume	increases.		Offloading	that	work	from	staff	to	researchers	
through	peer	review	is	not	necessarily	the	best	option,	since	it	demands	sustained	commitment	
to	the	resource	from	a	community	of	expert	users.		As	it	grows,	ADC	may	want	to	consider	
presenting	data	with	different	levels	of	curation,	including	self-deposited	(no	curation),	minimal	
curation	(curators	have	reviewed	data	and	documentation	for	completeness),	and	peer	
reviewed	(an	expert	reviewer	has	looked	in-depth	at	the	materials).		Once	a	submission	is	ready	
for	publication,	curators	use	a	range	of	manual	and	programmatic	publication	methods.			ADC	
and	LCA	both	have	a	curator	move	a	data	product	into	publication	mode,	while	i5k	uses	
indexing	software	to	make	data	easier	to	query	within	the	system?s	taxonomy.			
	
One important aspect of data preservation, which KSD has configured in different ways for
different data products, is documentation of data in its various states throughout curation. LCA
uses a spreadsheet template to document changes, publishing it to GitHub along with the data
as submitted, the data as published, and data documentation. In a sense, GitHub is LCA's
preservation repository and the materials there are a core part of KSD's digital assets. The i5k
and ADC servers are the primary targets for the digital preservation of those two products,
although KSD staff should make careful note of where they store other materials and versions
of data published to those systems, since these likely also require long-term preservation.
	 	
Conclusion:	Recommendations	and	system	diagram	for	digital	
preservation	at	the	library	as	a	whole	
This	section	of	the	report	considers	high	level	infrastructure	and	workflow	changes	that	would	
support	digital	preservation	at	NAL,	bringing	together	the	findings	of	each	section	of	the	report	
with	recommendations	for	building	a	more	robust	suite	of	systems.	
Infrastructure	Recommendations	
The	current	state	of	digital	storage	and	preservation	at	NAL	can	be	described	as	system	back-up,	
lacking	a	preservation	focus.		Through	an	automated	process,	a	member	of	the	ISD	team	
oversees	the	daily	backup	of	about	184	virtual	machines	at	NAL,	storing	approximately	50TB	of	
data.		Using	Veritas	NetBackup	Enterprise	8,	NAL	creates	a	disk-based	snapshot	of	the	virtual	
machines.		On	a	weekly	basis,	the	disk	writes	to	tape,	which	is	stored	offsite	by	the	vendor	JK	
Moving.		There	is	also	a	back-up	copy	of	NAL	data	stored	offsite	and	refreshed	yearly.		The	
technical gap between NAL's current digital storage back-ups and a robust preservation plan is
primarily in methods to assure that the data will remain stable and accessible over time. Best
practices for digital preservation suggest multiple copies of data on multiple servers in multiple
geographic locations, with continuing automated fixity checks to ensure the authenticity of the
data over time (Philips et al. 2013). While NAL uses two locations for data storage, it lacks the
geographic distribution and automated fixity checks between copies that best
practices recommend. A practical option to address this problem is membership for NAL in a
group	like	the	Digital	Preservation	Network	or	the	LOCKSS	Program	(Lots	of	Copies	Keep	Stuff	
Safe,	hosted	at	Stanford	University	Libraries),	which	provide	robust	systems	for	data	ingest,	
replication,	and	fixity	checks	in	exchange	for	a	membership	fee.	
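
To make the idea of automated fixity checks between copies concrete, the following Python sketch builds a checksum manifest for each copy of a collection and reports where the copies disagree. It is an illustration of the general technique only, not a description of how the Digital Preservation Network, LOCKSS, or Fedora implement fixity. The function names are hypothetical, and SHA-256 is an assumption; any strong hash would serve.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def compare_copies(reference, other):
    """Report files that are missing, unexpected, or altered in `other`."""
    shared = set(reference) & set(other)
    return {
        "missing": sorted(set(reference) - set(other)),
        "unexpected": sorted(set(other) - set(reference)),
        "altered": sorted(n for n in shared if reference[n] != other[n]),
    }
```

Run on a schedule against each storage location, a comparison like this would surface silent corruption or loss in one copy while another copy is still intact, which is the protection that plain back-up does not provide.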
Some	elements	of	current	NAL	infrastructure	support	digital	preservation,	and	that	capacity	
should be used to NAL's advantage. The Fedora software supporting NAL's Unified Repository is
one prominent example. Fedora has several features supporting digital preservation that have
been built into more recent versions of the software (Fedora 2018). By building a version of the
Unified Repository using the 4.7.x release, NAL would be able to take advantage of the
persistence, fixity, auditing, and versioning features of the software. The Unified Repository
currently houses both in-house digitized materials (one portion of the materials produced in
workflow 1 in this report) and the publication metadata produced in workflow 3. During NAL's
Fedora installation upgrade, the Unified Repository should be considered as a potential
in-house preservation system for Internet Archive produced materials, research data storage
(as a central preservation solution for NAL's several systems), and archived NAL web content
as well.
Workflow	Recommendations	
A	key	facet	in	the	recommendations	given	in	this	report	is	an	upgrade	to	a	new	Fedora	
installation.		Although	the	Fedora	software	is	open	source	and	has	no	licensing	fee,	the	upgrade	
would	require	significant	labor	costs	to	NAL,	including	earmarking	the	time	of	already	overtaxed	
members	of	ISD.		While	NAL	tends	to	use	contractors	for	major	technical	infrastructure	
upgrades,	the	library	should	use	caution	in	relying	too	much	on	contractor	labor.		Contractors	
will	need	to	work	closely	with	NAL	staff	members	managing	and	using	the	current	installation	to	
ensure	that	it	is	configured	to	meet	their	work	needs	and	support	workflows	from	many	areas	
of	the	library.		In	particular,	this	report	recommends	several	changes	to	current	workflow	
practices	to	enhance	preservation	within	the	Unified	Repository,	which	a	new	installation	
should	support.	
1. Build a new Unified Repository on a 4.7.x release of Fedora, taking advantage of the
software's preservation capabilities (described in the previous section). This would
provide robust preservation assurances for collections already housed in the UR.
2. Programmatically	add	all	new	and	preexisting	NAL	content	which	resides	in	the	Internet	
Archive	to	the	Unified	Repository.		The	script	currently	in	use	to	upload	new	in-house	
digitized	materials	to	IA	was	written	in	collaboration	between	ISD	and	Internet	Archive	
staff.		A	similar	collaboration	should	take	place	to	write	a	script	that	reverses	the	
process,	providing	NAL	with	its	own	digital	copies	of	Internet	Archive	produced	digital	
objects	and	their	metadata,	through	deposit	in	the	UR.	
3. Research	data	are	not	currently	housed	in	the	Unified	Repository,	but	if	the	repository	is	
re-envisioned	as	a	preservation	environment,	this	should	change.		Because	KSD	is	
working	towards	recognition	and	use	of	Ag	Data	Commons	as	a	disciplinary	data	
repository	for	the	agricultural	research	community,	it	should	begin	that	process	by	
cataloging	and	storing	datasets	from	other	KSD	data	products	into	the	ADC	system.		ADC	
can	serve	as	the	repository	for	all	KSD	products,	with	links	back	to	datasets	in	systems	
like	i5k	and	LTAR	that	support	enhanced	exploration	and	use	of	the	data.		An	(ideally	
automated)	process	should	then	be	used	to	copy	data	to	the	Unified	Repository,	to	
provide	back	up	and	preservation	support	for	the	materials	held	in	Ag	Data	Commons.	
4. Web archiving is not yet one of the services performed by NAL, but it should be
considered an essential part of a robust digital preservation system. Using software like
Archive-It or Heritrix, NAL could begin to package and make accessible snapshots of the
Library's own websites, along with other websites determined necessary for
documenting agriculture. Web materials managed by ARS and other groups within
USDA are particularly important targets for web archiving by NAL. Access to web
archiving	software	and	a	clear	collecting	scope	for	web	materials	will	enable	NAL	to	
collect	this	valuable	part	of	agricultural	history.		The	Unified	Repository	should	be	
configured	to	manage	and	store	archived	websites.	
5. With	the	Unified	Repository	updated	and	configured	to	support	robust	digital	
preservation,	NAL	should	begin	membership	in	a	digital	preservation	consortium,	such	
as	the	Digital	Preservation	Network	or	LOCKSS,	to	ensure	the	long-term	preservation	of	
materials	stored	both	in	the	UR	and	in	other	NAL	systems.			
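
As an illustration of recommendation 2, the first step of a script that brings Internet Archive hosted materials back to NAL would be to work out which files to fetch for each item. The Python sketch below is a starting point under stated assumptions, not the script ISD and Internet Archive staff would write. It assumes the item description has already been retrieved as JSON from the Internet Archive metadata endpoint and that each file entry carries "name", "source", and "md5" keys. The function name is hypothetical, and deposit into the UR is out of scope.

```python
from urllib.parse import quote

# Public Internet Archive URL patterns; item identifiers come from NAL records.
METADATA_URL = "https://archive.org/metadata/{identifier}"
DOWNLOAD_URL = "https://archive.org/download/{identifier}/{name}"


def plan_downloads(identifier, item_metadata, originals_only=True):
    """List (url, md5) pairs for the files of one Internet Archive item.

    `item_metadata` is the parsed JSON for the item. With `originals_only`,
    derivative files generated by the Internet Archive are skipped.
    """
    plan = []
    for entry in item_metadata.get("files", []):
        if originals_only and entry.get("source") != "original":
            continue
        url = DOWNLOAD_URL.format(identifier=identifier, name=quote(entry["name"]))
        plan.append((url, entry.get("md5")))
    return plan
```

Carrying the checksum along with each URL would let the harvesting script confirm every download before the file is deposited, tying this step to the fixity practices recommended earlier in this section.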
The	following	diagram	illustrates	the	proposed	relationships	between	current	and	not-yet-
existent	systems	at	NAL.		Note	that	solid	lines	in	the	figure	represent	data	flows	already	in	place,	
while	dashed	lines	represent	proposed	relationships.	
[Figure: Proposed Digital Preservation System Diagram for NAL. Elements shown: LCA Commons, LTAR, i5k, Ag Data Commons, and future KSD projects, with Ag Data Commons as central data repository; in-house digitized materials, IA digitized materials, web archives, and publications and metadata; the Unified Repository (updated with preservation capabilities); and a Preservation Consortium.]
 
	
Organizational	Recommendations	
NAL	should	establish	a	preservation	working	group	to	be	led	by	ISD	with	participation	from	two	
staff members from each branch. The staff members from each unit should begin by reviewing
and verifying the findings on current practices given in this report, augmenting them with
their knowledge of materials that require preservation and of any key workflow steps
missing from these models. This effort should result in
an	inventory	of	digital	collections:	a	spreadsheet	that	captures	information	on	digital	
preservation	needs	at	the	collection	level,	rather	than	at	the	object	level.		The	inventory	process	
is	described	in	the	Digital	Preservation	Workflow	Curriculum	compiled	by	the	Digital	
Preservation	Network	(2018).		Reviewing	this	curriculum	as	a	group	would	be	a	valuable,	yet	
accessible, learning exercise for members of NAL's preservation working group, helping them
begin	with	a	shared	understanding	of	digital	preservation	and	how	to	achieve	it.		As	listed	in	
Module	2	of	the	training	on	the	topic	of	selection,	fields	captured	in	a	digital	collections	
inventory	should	include:		
• "Collection title/s
• Location/s of content
  o On which server or network drive? On external hard drives, DVDs, or CDs? In what box? On which shelf? In which room?
• Agents responsible for creating the collection (e.g., donor, digital collections department)
• Agents responsible for curating the collection (e.g., archivist, digital preservation librarian)
• Content stream (e.g., born-digital, digitized)
• Format
• Number of files
• Size of collection (in bytes)
• Collection creation date/s, date of initial inventory, event-related dates
• Agent responsible for inventorying collection
• Assessment information"
(Digital	Preservation	Network	2018,	Module	2,	Slides	17-18)	
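
The inventory fields listed above map directly onto the columns of a spreadsheet. As a minimal sketch of how the working group might keep the inventory consistent across units, the Python example below defines one record per collection and writes the records to CSV, which any spreadsheet program can open. The class name, field names, and function name are hypothetical renderings of the fields quoted above, not part of the curriculum.

```python
import csv
from dataclasses import dataclass, asdict, fields


@dataclass
class CollectionRecord:
    """One row of a collection-level digital inventory."""
    collection_title: str
    locations: str          # server, network drive, or physical media location
    creating_agents: str
    curating_agents: str
    content_stream: str     # e.g., born-digital or digitized
    file_format: str
    number_of_files: int
    size_bytes: int
    dates: str              # creation, initial inventory, event-related dates
    inventory_agent: str
    assessment: str         # notes on long-term value


def write_inventory(records, handle):
    """Write collection records as CSV to an open text file handle."""
    names = [f.name for f in fields(CollectionRecord)]
    writer = csv.DictWriter(handle, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
```

Keeping the column names fixed in one place means that inventories compiled separately by each unit can later be merged into a single library-wide sheet without manual reconciliation.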
The assessment step is particularly vital here: rather than simply asking which collections NAL
has	on	various	media,	the	group	should	determine	the	long-term	value	of	collections.		In	
preservation	efforts,	the	team	should	focus	on	those	materials	of	high	value	to	NAL	
stakeholders.	
Through	a	combination	of	technical,	workflow,	and	organizational	changes,	NAL	is	poised	to	
provide	robust	preservation	of	digital	objects.		By	focusing	sustained	attention	and	resources	on	
this	important	challenge,	NAL	will	be	able	to	confidently	meet	its	mandate	to	safely	steward	
valuable	agricultural	information.	
 	
Appendix	1	Project	Charter	
November	16,	2016	
	
Morgan	Daniels	
Postdoctoral	Fellow	for	Digital	Preservation	
University	of	Maryland	College	Park	
mgd@umd.edu	
	
This	charter	describes	a	project	examining	digital	preservation	workflows	across	departments	at	the	
National Agricultural Library. In accordance with the OneNAL vision for the library's future, the research
team	at	the	University	of	Maryland	iSchool	(namely	Ricky	Punzalan,	Morgan	Daniels,	Katie	Gucer,	and	
Adam	Kriesberg)	perceives	digital	preservation	as	a	concern	that	transcends	departmental	distinctions	
within an organization. One digital preservation infrastructure, with accompanying workflows, can and
should serve the library regardless of the unit responsible for particular content.
	
In order to help NAL streamline the digital preservation process, the team (led in this particular effort
by	Postdoctoral	Fellow	Morgan	Daniels)	proposes	an	assessment	of	current	digital	preservation	activities	
across	the	library.		Using	a	Contextual	Inquiry*	approach,	Daniels	will	sit	down	with	individual	members	
of	each	unit	in	the	library	to	learn	about	their	current	digital	preservation	work,	encompassing	born-
digital,	digitized,	and	web-hosted	database	collections.		By	observing	individuals	as	they	work	in	their	
normal	context	(at	their	own	workstation)	and	asking	questions	as	the	work	proceeds,	Daniels	will	gain	
an	understanding	of	each	person's	practices,	combining	those	practices	into	models	that	will	create	an	
understanding of digital preservation across the library. A number of questions can be addressed
through this process, including: How does information flow through each department in the process of
storing and saving digital materials for the long term? What blockages exist, and how might they be
repaired? How can the work be reconfigured to make people's jobs easier? These questions can be
answered	using	Contextual	Inquiry	methods,	which	consist	of	two-hour	meetings	with	individuals	from	
various	departments	whose	work	encompasses	or	interacts	with	digital	preservation	activities	in	some	
way.		Meetings	will	begin	with	a	brief	set	of	interview	questions,	followed	by	an	observation	period	
during	which	Daniels	will	ask	individuals	to	walk	her	through	their	storage	and	preservation	workflows,	
asking questions about their work along the way.  She will ask participants for their permission to audio
record the meetings, while assuring them that workflows will be discussed but individuals will not be
identified in reports.
	
By combining the results of numerous such interviews, Daniels will be able to create a big-picture view
of digital preservation across NAL while retaining the smaller differences between activities in different
units	of	the	library.		The	research	will	culminate	in	a	report	back	to	NAL,	with	specific	recommendations	
for	redesigning	and	improving	preservation	workflows,	and	in	journal	publication(s).	
	
The	greatest	barrier	to	conducting	a	Contextual	Inquiry	project	within	an	organization	is	getting	buy-in	
across	units.		In	order	for	this	proposed	research	to	succeed,	Daniels	will	need	assistance	from	the	heads	
of	each	departmental	unit	in	securing	the	participation	of	staff	members	across	NAL.			
	
	
	 	
Timeline	
Date	Activity
November 2016	Project charter authored, negotiated, and agreed upon by the UMD and NAL teams
December 1-15, 2016	Morgan Daniels will develop specific study methods, including a semi-structured interview protocol, recruitment text, and contextual inquiry methods
December 15-30, 2016	UMD team will update Institutional Review Board (IRB) documents to reflect new study procedures, working with the University of Maryland IRB to assure that the study continues to meet human subjects protections
January 1-March 31, 2017	Morgan Daniels will recruit participants for, and perform contextual inquiry interviews with, two or more individuals in each NAL unit, creating a workflow diagram for each individual interview.  While the individual interview recordings and models will not be made available to NAL leadership (to protect participant privacy), consolidated models illustrating activity within each unit will be created during the next project period
April 1-June 30, 2017	Daniels will analyze data during this period, combining individual workflow models into one consolidated model for each unit and an overall model for the entire library
July 2017	Daniels will author a report to NAL describing the study's findings
August 2017	Morgan Daniels' appointment concludes, unless extended
	
	
Responsibilities	
Morgan	Daniels,	Postdoctoral	Fellow	in	Digital	Preservation	at	UMD,	will	design	and	implement	the	
study,	with	input	from	Ricky	Punzalan,	Adam	Kriesberg,	and	Katie	Gucer.		She	will	collect,	manage,	and	
analyze	study	data,	derived	primarily	from	Contextual	Inquiry	interviews.		She	will	be	lead	author	on	the	
report	to	NAL	and	on	publications	and	presentations	related	to	this	work.	
	
NAL	unit	leaders	will	assist	the	study	by	participating	in	Contextual	Inquiry	sessions,	suggesting	potential	
participants,	and	reaching	out	to	staff	members	within	their	unit	to	request	their	participation.	
	
	
About	Contextual	Inquiry	
Contextual	Inquiry,	a	data	collection	technique	associated	with	user	experience	research,	was	developed	
by	Karen	Holtzblatt	and	Hugh	Beyer.		As	a	Graduate	Student	Instructor	at	the	University	of	Michigan,	
Daniels	taught	this	technique	to	Master's	students	and	guided	their	project	teams	as	they	worked	to	
solve	an	organization's	information	problem.	This	project	will	use	techniques	described	in	the	book		
Rapid	Contextual	Design:	A	How-to	Guide	to	Key	Techniques	for	User-Centered	Design		
by	Karen	Holtzblatt,	Jessamyn	Burns	Wendell	and	Shelley	Wood.			
ISBN:	978-0-12-354051-5	
Appendix	2	Interview	Protocol	
The	goal	of	these	interviews	is	to	learn	about	digital	storage	and	preservation	related	work	ongoing	at	
the	National	Agricultural	Library,	across	all	units	and	departments.		Information	gained	from	these	
interviews	will	inform	recommendations	for	a	digital	preservation	program	at	NAL.	
	
Demographic	
1. Please	state	your	official	title	and	scope	of	responsibilities	at	NAL	
a. Probe	for	contractor	status,	length	of	service	
Extent	of	digital	materials	
2. Please describe the types of digital materials you deal with in your work at NAL (including "born
digital" materials, digital representations of analog material, and web-hosted databases).
a. What	is	the	general	extent	of	each	type?	
b. What	kinds	of	growth	patterns	are	you	seeing	annually?	
Digital	workflows	
3. Please	walk	me	through	your	workflow	for	each	type	of	material,	showing	me	the	steps	you	take	
with	each	type	of	material.		
a. How	does	the	material	reach	you?	
b. In	what	ways	do	you	process	it?	
i. Probe	specifically	for:	reformatting,	adding	or	changing	metadata,	changing	
storage	location	and	specific	tools	and	software	used	
c. What	happens	to	it	after	you	have	processed	it?	
d. Where	does	it	get	stored?	Is	there	backup	storage/	a	redundant	copy	made?	
e. What	challenges	arise	during	your	work	receiving,	processing,	and	storing	digital	
materials	(if	any)?	
f. What	would	make	this	process	work	better	for	you?		For	your	colleagues?	
4. What	metadata	standards	are	most	relevant	for	your	work	with	digital	objects?	
5. What	are	the	most	important	aspects	of	the	digital	materials	you	work	with	to	preserve	for	the	
long	term?	
NAL	digital	context	
6. (If	applicable)	What	projects	have	you	previously	worked	on	at	NAL	related	to	digital	objects?	
7. What	formal	and	informal	training	have	you	received	which	prepared	you	for	work	with	digital	
materials?	
a. Probe	for	grad	school,	learning	from	others	at	NAL,	what	opportunities	would	you	like	to	
see	at	NAL,	access	to	training	opportunities	(finding,	time	off,	etc)	
8. What	do	you	see	as	the	challenges	on	the	horizon	for	NAL	and	USDA	around	digital	assets?		For	
the	agricultural	research	community	more	broadly?	
a. Prompt	for	digital	preservation,	safekeeping,	backups,	archiving,	sustainability	
9. Who	else	should	I	talk	to	about	these	issues,	specifically	people	in	your	unit	or	people	involved	
in	your	workflow	for	these	materials?	
	 	
References	
	
Digital	Preservation	Workflow	Curriculum.		Retrieved	from	http://dpn.org/members	(in	the	Best	
Practices	section	of	the	site)	
	
Fedora	(2018).	Fedora	and	Digital	Preservation.		Retrieved	from	
http://fedorarepository.org/fedora-and-digital-preservation		
	
Holtzblatt,	K.,	Wendell,	J.	B.,	and	Wood,	S.	(2005).		Rapid	Contextual	Design:	A	How-to	Guide	to	
Key	Techniques	for	User-Centered	Design.		Morgan	Kaufmann	Publishers,	San	Francisco,	CA.	
Kriesberg,	A.	(2016).		NAL	Digital	Curation	Plan.		Beltsville,	Maryland.	(Internal	report,	not	
published.)	
Life	Cycle	Assessment	Commons.	(n.d.)	Life	Cycle	Assessment.		Retrieved	from	
https://data.nal.usda.gov/life-cycle-assessment	
Phillips, M., Bailey, J., Goethals, A., and Owens, T. (2013).  The NDSA Levels of Digital Preservation:
Explanation and Uses.  Proceedings of the Archiving (IS&T) Conference, April 2013, Washington,
DC.  Retrieved from http://ndsa.org/documents/NDSA_Levels_Archiving_2013.pdf
Poelchau, M., et al. (2015).  The i5k Workspace@NAL: enabling genomic data access,
visualization and curation of arthropod genomes.  Nucleic Acids Research, 43(D1), D714-D719.
Rieger, T. (Ed.). (2016).  Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of
Raster Image Files.  Federal Agencies Digital Guidelines Initiative (FADGI) Still Image Working
Group.  September 2016.  Retrieved from
http://digitizationguidelines.gov/guidelines/FADGI%20Federal%20%20Agencies%20Digital%20Guidelines%20Initiative-2016%20Final_rev1.pdf
	
	
	
	