Have you ever heard two people describe the same thing but come up with two very different descriptions? Discrepancies can cause problems. And when it comes to gathering data (and analyzing that data), construction engineer David Jeong says discrepancies in language present huge roadblocks.
“You want to use the term ‘highway,’ I want to use the term ‘roadway,’” Jeong said, offering an example of language discrepancy. “Are we talking about the same thing or different things?”
Now, Jeong and his fellow researchers at Iowa State University (ISU) are attempting to add clarity.
This month, the National Science Foundation (NSF) awarded almost $300,000 for Jeong and his team to begin their project “A Natural Language Based Data Retrieval Engine for Automated Digital Data Extraction for Civil Infrastructure Projects.”
ISU’s Evgeny Chukharev-Hudilainen is a co-principal investigator for the project.
“Natural language is full of discrepancies between form and meaning of linguistic units,” Chukharev-Hudilainen said. “Developing a system that [will disambiguate] natural language is a hard task, but also crucial for analyses that rely on natural-language data sources.”
The three-year project aims to develop a system that can pull information from various civil infrastructure data sources and quickly sort through it. Those sources could include engineering design manuals, highway data or construction specifications.
“We have started to collect a huge amount of data, no matter which industry sector you are in,” Jeong said. “The construction industry is not an exception. And collecting this digital data using various digital devices has become very economical.”
But as the collection increases, so does the need to analyze the ever-growing data mass.
“We have collected a lot of data, but we don’t know how to dig into this data and extract meaningful information,” Jeong explained.
Jeong, who is an associate professor in ISU’s Department of Civil, Construction and Environmental Engineering (CCEE), outlined several goals of his research team. The team will explore ways to make machines or computers recognize a user’s search intentions from his or her natural language request or query. To do this, the team will develop an algorithm that could quickly recognize the most relevant text or numerical data entities in a user’s request. The project would include testing the algorithm using civil infrastructure text documents: design manuals, guidelines, and technical specifications.
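As a rough illustration of the kind of term matching described above, the short Python sketch below maps different words for the same thing (such as "roadway" and "highway") to one canonical term before comparing a request against documents. It is not the team's algorithm. The synonym table, the function names normalize and rank_documents, and the sample documents are all hypothetical, and a real system would likely need far richer language analysis than simple word overlap.

```python
import re

# Hypothetical synonym table: maps variant terms to one canonical term.
# These entries are illustrative assumptions, not the project's vocabulary.
SYNONYMS = {
    "roadway": "highway",
    "road": "highway",
    "spec": "specification",
    "specs": "specification",
    "specifications": "specification",
}


def normalize(text):
    """Lowercase the text, split it into words, and map each word to its canonical term."""
    words = re.findall(r"[a-z]+", text.lower())
    return {SYNONYMS.get(word, word) for word in words}


def rank_documents(query, documents):
    """Return names of documents sharing at least one canonical term with the query,
    ordered by the number of shared terms (ties broken alphabetically)."""
    query_terms = normalize(query)
    scores = {name: len(query_terms & normalize(body)) for name, body in documents.items()}
    matching = [name for name in scores if scores[name] > 0]
    return sorted(matching, key=lambda name: (-scores[name], name))


# Hypothetical sample documents standing in for manuals and guidelines.
documents = {
    "design_manual": "Roadway design speed and lane width guidance",
    "bridge_guideline": "Inspection intervals for bridge decks",
}

print(rank_documents("highway lane width", documents))
```

In this sketch, a request that says "highway" still finds the manual that says "roadway," because both words are reduced to the same canonical term before the comparison is made.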
Jeong says a major goal in developing the algorithm is to get information to contractors, designers and project owners quickly and efficiently.
“We should be able to take advantage of this growing amount of digital data to make changes on how we can smartly deliver our projects,” Jeong said.
Jeong is the principal investigator for this project. Evgeny Chukharev-Hudilainen of ISU’s Department of English and Stephen Gilbert of ISU’s Department of Industrial and Manufacturing Systems Engineering are both co-principal investigators. The NSF Division of Civil, Mechanical and Manufacturing Innovation funds this grant.
To learn more, visit Jeong’s website. Keep up to date on ISU CCEE research at ccee.iastate.edu, on Facebook and Twitter, and on the Iowa State University Civil, Construction and Environmental Engineering and ISUConE LinkedIn pages.