Review Paper 1

Research and reviews in question answering system, 2013

Sanjay K Dwivedi, Vaishali Singh

Introduction

Most information retrieval systems leave users to extract the useful information themselves: in their quest for an accurate answer, users are presented only with an ordered list of relevant documents.

Challenges

One challenge for existing QA systems is to understand natural language questions correctly and deduce their precise meaning, so that exact responses can be retrieved.
A proper validation process is also required so that the answer deduced by the system is correct.

Three Stages

Question Analysis
1. Parsing
2. Question Classification
3. Query Formulation
Document Analysis
1. Extract candidate documents.
2. Identify answers
Answer Analysis
1. Extract candidate answers
2. Rank the best one.
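
The three stages above can be sketched as a small Python pipeline. This is a toy illustration under stated assumptions, not the architecture of any system in the paper: the keyword rules in `QUESTION_TYPES`, the stopword list, the term-overlap document scoring, and the regex-based answer extraction are all hypothetical simplifications.

```python
import re
from collections import Counter

# Hypothetical rules mapping the question word to an expected answer type.
QUESTION_TYPES = {"who": "PERSON", "where": "LOCATION", "when": "DATE"}
STOPWORDS = {"who", "where", "when", "what", "is", "was", "the", "of", "a", "an", "did", "in"}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def analyse_question(question):
    """Question analysis: parse, classify, and formulate a keyword query."""
    tokens = tokenize(question)
    answer_type = QUESTION_TYPES.get(tokens[0], "OTHER") if tokens else "OTHER"
    query = [t for t in tokens if t not in STOPWORDS]
    return answer_type, query


def analyse_documents(query, documents, top_k=2):
    """Document analysis: keep the documents sharing the most terms with the query."""
    scored = [(len(set(tokenize(doc)) & set(query)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def analyse_answers(answer_type, query, candidate_docs):
    """Answer analysis: extract candidate answers and rank them by frequency."""
    counts = Counter()
    for doc in candidate_docs:
        if answer_type == "DATE":
            # Crude assumption: a date answer is a four-digit year.
            counts.update(re.findall(r"\b\d{4}\b", doc))
        else:
            # Crude assumption: other answers are capitalised words not in the query.
            words = re.findall(r"\b[A-Z][a-z]+\b", doc)
            counts.update(
                w for w in words
                if w.lower() not in query and w.lower() not in STOPWORDS
            )
    return counts.most_common(1)[0][0] if counts else None


def answer_question(question, documents):
    answer_type, query = analyse_question(question)
    candidates = analyse_documents(query, documents)
    return analyse_answers(answer_type, query, candidates)
```

Each function corresponds to one stage, so a real system could replace any stage (for example, a trained classifier in `analyse_question`) without touching the others.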

Knowledge needed for building generic QA systems:

Artificial Intelligence
Natural language processing
Statistical Analysis
Pattern matching
Information Retrieval
Information Extraction (similar to Information Retrieval)

Approaches by various systems:

Linguistic Approach
Statistical Approach
Pattern Matching Approach

Linguistic Approach (LA)

This approach represents knowledge using production rules (similar to TOC), logic, frames, templates, ontologies, and semantic networks, which are analysed during QA pair analysis. Tokenization, POS tagging, and parsing are some of the techniques used in LA. Queries are formulated precisely, so that they can be run against structured databases only. Building structured knowledge bases is time-consuming, so this approach is used for long-term information needs in a particular domain.

Disadvantages:

Less portable: different domains need different grammar and mapping rules
Time-consuming to build

Statistical Approach

This approach is useful for the text repositories and web data available online. It copes with large amounts of data and the heterogeneity present in them, and it can formulate queries in natural language form. It requires a decent amount of data for precise statistical learning. Statistical techniques that have been used successfully for question type classification include SVMs, Bayesian classifiers, and maximum entropy models. Statistical techniques used for the answer-finding task in QA include n-gram mining, sentence similarity models, and Okapi similarity.

Disadvantages:

Fails to identify linguistic features of combinations of words and phrases
Treats each term independently
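
To make the Okapi similarity mentioned above concrete, here is a minimal sketch of Okapi BM25 scoring in Python. It is an illustration, not the method of any system surveyed in the paper: the function name `bm25_scores`, the whitespace tokenisation, the defaults `k1=1.5` and `b=0.75`, and the "+1 inside the logarithm" IDF variant are all assumptions made here. The loop sums one score per query term, which also shows the second disadvantage: each term is treated independently.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (illustrative sketch)."""
    if not docs:
        return []
    # Assumption: naive lowercase whitespace tokenisation.
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    avg_len = sum(len(d) for d in tokenised) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for d in tokenised for term in set(d))

    scores = []
    for d in tokenised:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Non-negative IDF variant (adds 1 inside the logarithm).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            # Terms contribute independently; word order and phrases are ignored.
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Calling `bm25_scores(["cat", "mat"], docs)` returns one score per document, and a document containing none of the query terms scores zero.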

Pattern Matching Approach

This approach uses the expressive power of text patterns. It is considered a simple approach and is well suited to small and medium-sized websites. PMA is of two types:

Surface Pattern Based

Patterns are either hand-crafted or learned automatically from examples.

The answers are extracted using statistical techniques or data mining measures.

The answers obtained are not formatted.
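
As an illustration of the surface pattern idea, the sketch below applies two hand-crafted patterns to text snippets and picks the most frequent match, a very simple statistical measure. The patterns, the birth-year question type, and the function name `answer_birth_year` are hypothetical examples chosen here, not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical hand-crafted surface patterns for "When was X born?" questions.
# {name} is filled in at query time; doubled braces survive str.format as {4}.
BIRTH_PATTERNS = [
    r"{name} was born in (\d{{4}})",
    r"{name} \((\d{{4}})\s*-",
]


def answer_birth_year(name, snippets):
    """Return the year matched most often across all snippets, or None."""
    counts = Counter()
    for template in BIRTH_PATTERNS:
        pattern = re.compile(template.format(name=re.escape(name)))
        for snippet in snippets:
            # findall returns the captured year for every match in the snippet.
            counts.update(pattern.findall(snippet))
    if not counts:
        return None
    # Simple frequency vote over the extracted candidates.
    return counts.most_common(1)[0][0]
```

The result is a raw string lifted from the text, which matches the point above that surface pattern answers are not formatted.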

Template Based

A template is preformatted for questions, and its entity slots are filled dynamically.

A structured query is used to extract the answer from a database.

The answers obtained are formatted.
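
The template-based variant can be sketched the same way. In the example below, a question template with one entity slot is matched against the question, the slot value is passed to a structured SQL query, and the result is returned as a formatted sentence. The template, the `countries` table, and the function names are hypothetical and exist only for this illustration; an in-memory SQLite database stands in for the structured database.

```python
import re
import sqlite3

# Hypothetical question template with a single entity slot, <country>.
TEMPLATE = re.compile(r"what is the capital of (?P<country>[\w ]+)\??$", re.IGNORECASE)
# Structured query; the slot value is bound as a parameter, not interpolated.
SQL = "SELECT name, capital FROM countries WHERE lower(name) = lower(?)"


def build_demo_db():
    """Create a tiny in-memory database (assumed schema, demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE countries (name TEXT PRIMARY KEY, capital TEXT)")
    conn.executemany(
        "INSERT INTO countries VALUES (?, ?)",
        [("France", "Paris"), ("Japan", "Tokyo")],
    )
    return conn


def answer_from_template(question, conn):
    """Fill the template slot from the question and return a formatted answer."""
    match = TEMPLATE.match(question.strip())
    if match is None:
        return None  # the question does not fit the template
    row = conn.execute(SQL, (match.group("country").strip(),)).fetchone()
    if row is None:
        return None  # the entity is not in the database
    name, capital = row
    return f"The capital of {name} is {capital}."
```

Questions that do not fit the template return `None`, which reflects the limited coverage of a template-based system: every supported question form needs its own template.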

Future Scope:

Existing systems use one of these approaches or a combination of two of them; hybridising all three could drive innovation in the field of QA systems. QA systems aim for faster speed, greater relevance, and higher precision and recall.
