LARS VILHUBER: Hello. Today we will be hearing
from Josh Hawley who will be talking to us about
Ohio and the Longitudinal Data Archive. The Ohio Longitudinal
Data Archive was established in 2007. In the last five years, there
have been 28 published studies that have used data access
through the Ohio Longitudinal Data Archive-- OLDA. The OLDA's primary
research focuses on outcomes of
education and training, but it also engages
with researchers on human services, housing,
and health care as need arises. It's a collaboration between the
Ohio state government and Ohio State University to
make longitudinal data from multiple state agencies
available for research. And it's an example,
especially in the context of this handbook, of a robust
institutional partnership for researchers
and data providers looking to work with data and
to launch their own data center. Josh Hawley is a professor
at the John Glenn College of Public Affairs at
the Ohio State University. He also serves in
leadership roles at two research centers at OSU-- director of the Ohio
Education Research Center and associate director
for the Center for Human Resource Research. And with that, I invite you
to listen to Josh's talk. JOSHUA HAWLEY: Hi. My name's Josh Hawley,
and I'm a professor here at the Ohio State University. I work at the John Glenn
College of Public Affairs, and I have a joint
appointment also in the Center for Human Resource
Research in Arts and Sciences as well as directing something
called the Ohio Education Research Center here
at The Glenn College. In that role with
the university, I have worked with the state of
Ohio for close to 20 years in different capacities. In the most recent
decade, the work has extended to developing a
data system called the Ohio Longitudinal Data Archive. And that's the
subject of the webinar and the subject of the
chapter I wrote for the book. I'm going to start
off by telling you a little bit about what
we're going to do today. I'm going to talk with you
a little bit about some motivating questions, a
little bit about the basics behind the OLDA, our current
data holdings, some example projects, some things I'm
calling rules of the road, which are really about
governance and data use, and then offer just some
final thoughts for you if you're developing
your own system or thinking about
systems that you use. So the first thing
I want to note is just to talk a
little bit about what it means to be a faculty
member in this day and age. When I started in
university as a professor, I had a couple of years of
experience as a consultant and a few years of experience
as a middle school teacher. And I don't think the middle
school teaching really helped much until I had my own kids. But the consulting
experience was important in that
it made me somewhat pessimistic about
the role of a faculty member in a traditional
university sense. I wasn't much interested
in traditional publishing, and I actually thought
I would try teaching in a university for a few
years and see what happens. But it gives you some
critical questions about government itself. What do you want to
do when you actually want to be a participant in
government in the activity, and what role does
the university bring, particularly a state
university, to this process? So to that end then
we have built a system at the Ohio State
University that allows us to engage
with government and provide government
the technical architecture, through data
management, so that they can interact with university
faculty in a regularized way. Some basic facts about the
OLDA, because I think it's important just to start
with some sense of what we have done. This project exists
and has existed for 10 to maybe 15 years. It's hard to actually
date the founding. It's a collaborative
project, and it is a signed agreement
between the university and a number of state agencies. It stores data. That's one of the main
reasons it exists-- to store data at the university
on behalf of state agencies, and then
those data are available for external
and internal use. So in other words,
internal to the state and then external to the
state in terms of university or other partners. And the long-term goal
here is to generate evidence-based research used by
both researchers and government and improve public policy. What that can mean
in the long run really depends on the
questions that you have as a state agency and
your interests as a researcher. It's not particularly
well-suited for what we might call basic research. It is much more suited for
research to practice, research and action,
partnership research, any of those kinds of
research we tend to focus on. The OLDA is a
historical evolution of my activities at the
university and others at OSU. So, prior to 2007,
there was just a lot going on in different
parts of the university, in the state, in terms of
partnerships with faculty. I was doing some
work on remediation, then work on adult
vocational education. All of that required data. Another faculty member
had done a lot of work on welfare reform, some
kind of tracking data, and there were a bunch of
independent education projects that involved
prominent researchers from around the country. So at the same time,
there was a lot of this technical
development at the state. So, due to
changes in technology, the state of Ohio
had bought and built different data systems for its
K through 12 and its higher ed data, then it had legacy
systems on UI claims, unemployment insurance
claims, and those data systems came together at different
points to be useful--or not. At the same time, there
were exemplar systems in states like Florida that
brought these data together in more important
and useful ways. So you have
researchers increasingly requesting the data. You have new data
systems that exist, and you've got exemplary
architecture systems out there. Along comes a lot of
federal investment, and the federal investment was
critical in a number of areas. So, the Workforce Data
Quality Initiative was a Department
of Labor, and still is a Department of
Labor, project. Ohio, among other states, had
some significant investment that we helped direct here. The ARRA funding,
which was the funding that the federal
government put into place after the last
recession, was critical. It provided a good deal of
money to states in exchange for very direct policy action, some of which required
new data systems, and we again just rode that
wave and helped the state put that system together. And in many cases, during
this period of time you saw very rapid
expansion of integrated data use across the states. So a lot of states
saw the genesis of their longitudinal data
systems in this period of time. It wasn't simply the
federal investment. There was just an accumulation
of effort and interest on the part of researchers
in our state government. And I don't think we're
special in this respect. I think a lot of states
benefited from this. Many of the systems were very
internally focused, however, and I think what we have is
somewhat different here, in that those systems were
driven by the need to integrate data in states,
not to give researchers access to some data. So at the end of all this
federal investment, of course, there was a transition. We had really a
sustainability transition, where we had to figure out
how to stay in business if we wanted to because it is
a soft money operation, just like most faculty kind
of research projects are. And we also had to see
what our priorities would be in the future. So we made that transition
to additional funding. The state has been
very supportive. So in addition to facing kind
of a sustainability challenge, we also had new questions
that came to mind. And those questions required
that we think about integrating data across states. And that's been a
labor of love, I would like to say, for the
last four or five years, working with our partner
states in the Midwest trying to figure out how
to share labor market, education,
unemployment insurance data across the states to
be able to better measure the effects of different
state and federal policies. And we've done an
enormous amount of work with our colleagues at
the Coleridge Initiative, which is an independent
nonprofit that works in this area. We've also profited
from better technology. So one of the realities
in the post-2012 era was cloud-based storage
and cloud-based systems that make use of
FedRAMP-compliant data systems. Both our state and the
Coleridge Initiative have migrated to cloud-based
storage to an extent. And that's an important
technical lesson that we have. Ohio-- I've just kind of
highlighted two areas here. One is the university role. Ohio provides an example of
what happened during the era, and other universities
have similar stories. I think that prior
to 2008, we really profited from having
some outstanding faculty in various departments that
were using administrative data. And these were
primarily upon request from the state or the
federal government or from private organizations. And one early way that that
was all brought together was something called ADARE. ADARE was a project that Steve
Wandner and David Stevens-- Steve was at DOL. And Dave Stevens
was at the University of Baltimore, Maryland. And they brought
the states together to really link training and
labor data in unique ways across-- within states, but in
parallel questioning. And that was one of the first
kind of cross-state labor, education, workforce
projects I was involved in, in a significant way. And there were a lot
of kind of advantages of these early projects. One I want to flag, that's
also in the literature quite a bit around data systems
and vocational education, which is one of my substantive
areas of interest, is that when you work with the
state or federal government, having what might be
called the circulation of human capital
is really critical. So having people who go from
the government to the university to a nonprofit to a
business in different roles but have connections and can see
the value of working together. It's quite important. And so, at a certain point,
we had a former labor market information systems
director come to OSU and work in the Center, CHRR. And that was critical
because she knew a great deal about the data systems that
we did not and knew also which questions were relevant
at the state and federal level. And I soaked that up like
a sponge, that connection. I think you also just
have to be patient. I like to-- when
younger people ask me about getting into data
systems work, I say 10 years. You've just got to invest at least
10 years to build something. And anybody who wants to
do something more quickly is dreaming. So that's just one lesson. The second lesson is
about the relationship to federal investment. And again, there's a fairly
good public administration literature on this. But federal
investment is critical to both deepen and quicken
program outcomes in the states. And so the fact that the
federal government had big data programs, like the
Workforce Data Quality Initiative or the
state longitudinal data system at the K through
12 level, which I've not talked about
previously, is really critical for building legal
and administrative frameworks for sharing data. So we have these systems. The federal government invests. And it meant that different
states could get access to resources, but also the
technical assistance that developed from
the state systems. And so for a while,
on the WDQI side, we met regularly in
Washington or in other states. States would call us for advice. We would travel when we could
travel, and it wasn't COVID. And we would provide guidance. And it was fairly informal. There wasn't a great
deal of structure. I also really liked
the informality of technical assistance. I think it grows really well when
people can call upon experts that they trust. And those experts don't
have a stake in the game at their state level. Race to the Top, which was
the ARRA funding equivalent, is the accelerator
of all accelerators. It was an enormous
amount of money. Some of which was spent on data
systems in specific states. And I think it really did
seed a lot of work locally here in Ohio that we
could do and allow us to experiment in many ways. So the data systems
that we have available, we have data from the Department
of Higher Ed, the Department of Education, Department
of Job and Family Services, which is a large agency
that has several divisions, Opportunities for Ohioans
with Disabilities, which operates the Vocational
Rehabilitation Services, and the Ohio Housing
Finance Agency. And what you have
in front of you is basically a map of a recent
or recent map of data holdings. And you can see both the
periodicity of the data. And the number of files we
get in specific agency's vary quite a bit. So for example, we are
very much up to date on unemployment insurance
claims because of the crisis. We get pretty much everything
as quickly as we possibly can. In addition, we're getting some
really new data on homebuyers from the Housing Finance Agency. And then a lot of
the files are yearly, so they still show 2019 because
we haven't gotten our 2020 data pull. So you have a fair bit of
variation in the data system. But they are all available
as individual level files that researchers
can apply to access. And if they can frame
their project well enough for the state audience and if it
can pass IRB and other security restrictions, then
there is a possibility of getting some of this data
linked together by our staff, because that's one of
the services we provide. And then you can do a lot
of very interesting projects, some of which I'll go
over in a few minutes. So there's a couple
of facts here that I just wanted to highlight. They're like, to
me, greatest hits in some ways about data systems. Administrative
data is what it is. It's a layer cake. You can't change it. It's not survey data. The variables are not created
for your specifications or to your liking. They change over time. They're missing. And you just have to
accept that reality. Two specific examples: one
is education credentials and the other is
occupational codes, both of which are a really
critical input or output for the education system. We don't get to decide what
education credentials the state government codes based
on data, nor do we get to decide what the
occupational structure of the labor market is. Both of those are
imposed from the outside. So when researchers say to me, I
would like my 16 Baskin-Robbins flavors of ice
cream today, please, all of which have different
characteristics, I would say, you can't. You can have these two, vanilla
and chocolate, and that's it. And you can like it
or you can move on. But there's nothing I can
do about what's available. The second thing is,
metadata is really important. I cannot tell you how often I
talk to people who are computer scientists who are-- and they kind of
have this belief that you can wish into existence
knowledge about data systems, variables, and values by
just kind of dumping it into a kind of a spaghetti map. That's not-- that's
not the way knowledge about data systems exists. What's called the data
generating process, how the data come to be,
the administrative systems, the educational programs, the
workforce programs, they are-- there's a lot of
knowledge that's baked in that you need to
access to be able to interpret the data correctly. And so, one of the
most important things we do with researchers
is we provide them with some of that
content knowledge. You need to have the human
capital, the people in place, and you need to pay for
those people over time as part of a research effort. And you can't simply
write it all down. It will not work. Data maintenance is critical. Data changes; it's not a
shiny new toy all the time; it's stored in different
kinds of systems. The great example from
New Jersey this year, where they had to put like
an all points bulletin out for any COBOL
programmers to work on their ancient
unemployment claims data, it illustrated the fact
that data systems are old and have been neglected, and
government does not normally update them the way
industry would update, kind of, new toys. And I would say,
public agencies are not any different
than private firms. I've had the same conversation
about COBOL programmers with bank technologists
from Columbus. They still use COBOL for a
lot of their back end systems at banks. So just because a data system
is old doesn't mean it's bad, but you do have to account
for that maintenance cost, and account for the maintenance
human capital needs over time. You can't simply
hire people or have staff that know only the new stuff. The other thing that's
kind of a key fact here is questions,
questions, questions. Everything is about what's
important to the government at the time, which means
you as a researcher have to compromise. That-- I know that's a shocker
for many of us in academia, and it's the one thing that
I alluded to at the beginning that I was most excited about. I actually, I have interests
but I don't care what-- the way I ask things can
be framed in a way that is useful to government. And so we receive
many requests that seem very much out of
left field to government, because they haven't been
couched in policy terms. They've been couched in the
language of econometrics, or kind of
quantitative sociology, the two principal disciplines
we receive applications from. When they need to go
talk to a legislator, they need to go talk
to a government affairs person in their
university, and they need to reframe
what they're asking. That's not my job as
the operator of the OLDA to reframe your
questions for you. And over time,
we've gotten better at forcing researchers to
process that and come up with questions that
are, on their end, better, so that our
clients and the state can understand those questions. And this is only going
to become more important as we work with more
cross-state data, because nobody's going
to be able to understand an esoteric kind of
econometrics question unless it's framed in
a common way across states. So some example studies,
we've done both-- we've done an enormous number
of different kinds of studies for the state in addition
to funding researchers to do them or researchers
approaching us. All that stuff happens. So just some examples,
and you have some images in front of you on the screen. We've done quite a bit of
work on students dropping out from high school early on,
during ARRA and after ARRA. STEM has been an increasingly
important interest over time, both STEM at the
high school level, but also STEM at
the post-secondary. And I would say, increasingly,
the focus is turning to kind of the labor market,
strict labor market sides, so the unemployment, unemployment
claims, transition of particular groups
into the labor market. We've done a recent kind of RCT,
randomized controlled trial, with kind of, youth
who are having trouble accessing the labor
market throughout Ohio. So there's different kinds
of projects that we've done. And I think there's also been-- we've kind of,
again, because we're trying to be useful to
government and not simply use data that exists in government
for our own research purposes, we've gotten a lot of
requests for dashboards. And so we've become
quite adept at the technical dashboard work;
we're using tools, just web development tools, or
tools like Shiny, or, increasingly,
something like Tableau, which provides for quicker
rapid cycle development. You just have two in front
of you; there's plenty more. One is called the
Workforce Success Measures. The Workforce
Success Measures came about as a request from the
governor's office eight, nine years ago, I think. And at that point, it
was a new administration under John Kasich,
and a requirement that he wanted to
manage the, kind of, different programs
from the Workforce-- WIOA Act, the Workforce Act, at the federal level. He
wanted to have a dashboard that allowed him to see
the common outcome, to calculate what are called
the common measures for each of those programs, which include
things like, how much money did people make two quarters
after they finished a program? Four quarters after
they finished a program? Were they retained in a job? And those kinds of metrics,
just comparison across programs. And so doing the computational
work on the dashboard was complicated, but
also building that tool that would meet the needs of
the county executives, the state agencies, the governor's office. That's the thing we do. And we're actually now on
our third or fourth, I think, generation of that
project this year. We have a fairly tight timeline
to rebuild it for the spring. And then we have a
lot of work we've done on employment and
educational projections with the Department of Higher
Education and Job and Family Services, which operates a site
called Ohio Means Jobs, which provides job seekers, and
really anyone in Ohio, with access to a
series of tools, one of, some-- several of which
are tools we've developed. One here is employment
projections. So you can go in and look
at what jobs are available and in what fields, and where
they are, how much they pay, those kinds of good questions. And it's linked up to all
this really important data that we maintain. The last kind of substantive
area I want to talk about is what I call
rules of the road. So it can be really daunting
to set one of these systems up, and as I confessed
at the beginning, I didn't know we were setting
this up 15, 20 years ago when I started doing this work. But having done this,
I have some suggestions for those of you-- now I'm speaking to government
officials around the world, or academics--who are trying
to set up these things. Pay attention to governance. It is the place where you
will be able to get more out
of these investments, and it's also the
place where you're most likely to fall down. And I didn't-- I didn't build this framework,
I borrowed much of it from other states like
Washington and Florida, and also from my experience,
oddly enough, doing work internationally with international
organizations like UNESCO and the UNDP, and others where
I've done data work as part of education projects. So my sense of
governance is that there needs to be a policy
level governance. There needs to be
somebody who's in charge, talking to somebody else who's
in charge at another agency, talking about data. And so we have something
called the policy council. It advocates for the system. It kind of sets agendas, it
supervises me, basically. And they supervise kind of
funding decisions and data decisions. And we've met
pretty much every-- at this point, every
quarter, I would say, for the last 8 to 10 years. And so we have a
lot of time together that allows us to then pursue
other efforts collectively and be a trusted partner. We also have something
called a coordinating board. This is just a
contractual relationship. You need to have a
group that actually can get together and approve
invoices, approve project budgets, and serve as the
most direct point for me to be accountable to
the state government. And then we have a third
layer, it's something called data stewards,
which in a lot of systems can be separated by domain. These are the technical
staff in each agency. And in our sense, this can
get quite big sometimes, because each agency may have
two or three analysts working on the data set that
we're interested in, and sometimes those data
sets are more complicated. So what I would
suggest to you is to think about the
policy level, the kind of financial, operational
level, and the data level, at the very minimum. And on the data side, you might
also subset that nowadays into content and security,
because both are important. Just to give you a
little bit of how that fits into what we do here
at Ohio State, we have the-- our mission, really, because
we're a university center, is-- so we have this project, the
OLDA, and then I-- I've had this sign by my head this entire time
I'm talking, intentionally, because the OLDA's a
project of a research center that's a faculty generated
operation at Ohio State. It exists as a funded project. I created the effort at
a certain point, and-- myself and others,
and I have a staff that works together on kind of
bigger stuff than just the data system. So we have more at stake. And so the data
system work fits into, kind of, best practice work. It fits into, kind of
what might be called, stakeholder engagement with
educators, policymakers. And we also develop,
when we can, kind of dashboard
scorecards, materials, we have a whole kind of
best practice repository we've built on educational
practices for the Ohio Department of Education. So there's an enormous array
of things that we can do, and the data system
work is part of it. It's a key part. My vision, and I've said this
very transparently in a number of places, is that we're kind of
creating cutting edge knowledge and resources for Ohio, for
educators, policymakers, and community leaders. It's a cycle. I mean, I think
policy implementation, policy evaluation are
part of, kind of, cycles. And getting the best
outcomes for the children and for workers in
Ohio, who are the-- mostly the subjects of the
data, requires evaluation, it requires guest speaking,
best practice work, engagement with
community leaders. We did a fascinating
project that's about to be released on
the future of smart work with the city of
Columbus that represented a lot of engagement activity
with the city and the community leaders in the city. And I think at the end of the
day, that knowledge improves, and that's an academic goal,
but it's also a societal goal. So just finally as
a faculty member, I want to just encourage
you to constantly remake-- remake your own job. There's a diminishing
number of us. You have to be very
careful and intentional about what will make an impact. And I think the danger of COVID
in a very real way for faculty life is that it isolates
you more and more. I mean, I'm in the office
today teaching this afternoon in person, which is a rarity
these days, at least in Ohio. But I think it's an
important point to make, that people can be
engaged with students, engaged with policymakers. But you have to be
very intentional, you have to not let yourself
be isolated in your study, doing data work with-- on your own. It makes it really hard. So I do think
COVID kind of makes it clear that the traditional
recipe, the mix of research, service, and teaching, is-- needs to be rethought
fundamentally. Particularly at land
grant institutions, all research is research
to practice and partnership research. And I think lastly, the specific
lessons we have gained here at Ohio State and
other places can be translated to other
countries and universities. And I really look forward to-- I have a sabbatical coming
up in Southeast Asia, and I'm working on some of these
issues on data and labor data, and data utilization in
some of those contexts. So I look forward to dialoguing
with you next week when this video airs,
and you'll see here ways to contact me and
my staff, and don't hesitate to reach out if you're
seeing this on broadcast later. Thank you so much
for your attention.