Concise Range Queries System

Published on Feb 28, 2025

Abstract

We present a query formulation language, called MashQL, for easily querying and fusing structured data on the web. The main novelty of MashQL is that it allows people with limited IT skills to explore and query one (or multiple) data sources without prior knowledge about the schema, structure, vocabulary, or any technical details of these sources. More importantly, to be robust and cover most cases in practice, we do not assume that a data source has a schema, whether offline or inline. This poses several language-design and performance complexities that we fundamentally tackle.

To illustrate the query formulation power of MashQL, and without loss of generality, we chose the Data Web scenario. We also chose to query RDF, as it is the most primitive data model; hence, MashQL can be similarly used for querying relational databases and XML. We present two implementations of MashQL: an online mashup editor and a Firefox add-on. The former illustrates how MashQL can be used to query and mash up the Data Web as simply as filtering and piping web feeds; the Firefox add-on illustrates using the browser as a web composer rather than only a navigator. Finally, we evaluate MashQL on querying two datasets, DBLP and DBPedia, and show that our indexing techniques allow instant user interaction.

EXISTING SYSTEM:

Before formulating a query, one has to know the structure of the data and the attribute labels (i.e., the schema). End-users cannot be expected to investigate "what is the schema" each time they search or filter information. In many cases, a data schema might even be dynamic, i.e., many kinds of items with different attributes are frequently added and dropped. Other sources might be schema-free.

DISADVANTAGES OF EXISTING SYSTEM:

· Every user must know the schema and the query process in advance

· No content is displayed if the keyword does not match

PROPOSED SYSTEM:

We propose an interactive query formulation language, called MashQL. The novelty of MashQL (compared with related work) is that it considers all of the above assumptions together. Being a language (not merely an interface) and, at the same time, assuming data to be schema-free is one of the key challenges addressed in the design and development of MashQL. Without loss of generality, this article focuses on the Data Web scenario. We regard the Web as a database, where each data source is seen as a table. In this view, a data mashup becomes a query involving multiple data sources. To illustrate the power of MashQL, we chose to focus on querying RDF, which is the most primitive data model; hence, other models (such as XML and relational databases) can be easily mapped into it.
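The "mashup as a query over multiple sources" view can be illustrated with a minimal sketch. It assumes each source is simply a list of (subject, property, object) triples; the source names, identifiers, and the helper functions `mashup` and `author_names` are hypothetical and not part of MashQL itself.

```python
# Two hypothetical data sources, each modelled as a list of
# (subject, property, object) triples. All values are made up.
SOURCE_A = [
    ("A1", "Title", "Data Web"),
    ("A1", "Author", "P1"),
]
SOURCE_B = [
    ("P1", "Name", "Tom"),
    ("P1", "Affiliation", "University X"),
]


def mashup(*sources):
    """Merge several sources into one graph by concatenating their triples."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged


def author_names(graph):
    """Join 'Author' triples with 'Name' triples across the merged graph."""
    names = {s: o for s, p, o in graph if p == "Name"}
    return sorted(
        (article, names[author])
        for article, p, author in graph
        if p == "Author" and author in names
    )


print(author_names(mashup(SOURCE_A, SOURCE_B)))
```

Neither source alone can answer "which article was written by which named person"; the answer only appears once the two are queried together.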

ADVANTAGES OF PROPOSED SYSTEM:

· No prior knowledge about the data or its schema is needed

· No keyword is needed for searching; the query is built from what the user selects or types

MODULES:

· Database design

· Query Formulation Algorithm

· Select the query subject

· Select a property

· Add an object filter

· Query Language

· Graph Signature Index

· Implementation and Evaluation

MODULES DESCRIPTION:

Database design

In this module we design the database and create the tables that store the data.

Query Formulation Algorithm

This algorithm is used by the MashQL editor. Its novelty is that it allows one to navigate through and query one or more data graphs without assuming that the end-user knows the schema or that the data adheres to a schema.

Select the Query Subject

After specifying the dataset G, users select the query subject S from a drop-down list that contains either: (i) ST: the set of the subject-types in G, such as Article; or (ii) SI: the union of all subject and object identifiers in the dataset; or (iii) a user-defined subject label. In the last case, the subject is seen as a variable (S ∈ V) and displayed in italics; the default subject is the variable label anything.
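A minimal sketch of how the three subject lists could be computed, assuming the dataset is a list of (subject, property, object) triples and that subject-types are declared with an `rdf:type` property. The toy data, the `Lit` wrapper for literal values, and the function names are illustrative assumptions, not the editor's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    """Wrapper marking an object as a literal value rather than an identifier."""
    value: str


TYPE = "rdf:type"  # assumed property that declares a subject's type

# Toy dataset G; every identifier here is made up for illustration.
G = [
    ("A1", TYPE, "Article"),
    ("A1", "Title", Lit("Data Web")),
    ("A1", "Author", "P1"),
    ("A2", TYPE, "Article"),
    ("A2", "Year", Lit("2007")),
    ("P1", TYPE, "Person"),
    ("P1", "Name", Lit("Tom")),
]


def subject_types(graph):
    """ST: the types that subjects in the graph are declared to have."""
    return sorted({o for _, p, o in graph if p == TYPE})


def subject_identifiers(graph):
    """SI: union of all subject identifiers and all non-literal objects."""
    ids = set()
    for s, _, o in graph:
        ids.add(s)
        if not isinstance(o, Lit):
            ids.add(o)
    return sorted(ids)


def subject_candidates(graph, user_label="anything"):
    """The three groups offered in the subject drop-down list."""
    return {
        "ST": subject_types(graph),
        "SI": subject_identifiers(graph),
        "V": [user_label],  # a user-defined label is treated as a variable
    }


print(subject_candidates(G))
```

Whether type names such as Article should also appear in SI is a design choice; this sketch follows the literal reading of "all subject and object identifiers" and includes them.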

Select a Property

Depending on the subject(s) chosen in step 1, a list of the possible properties for this subject is generated (Figure 6.B). There are four possibilities: (i) if S ∈ ST, such as Article, the list is the set of all properties that the instances of this subject-type have (e.g., Title, Author, Year); (ii) if S ∈ SI, such as A1, the list is the set of all properties that this particular instance has; (iii) if the subject is a variable (S ∈ V), the list is the set of all properties in the dataset; (iv) users can also choose the property to be a variable by introducing their own label.
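The first three cases can be sketched as a single lookup function. This is an illustration under assumptions: the dataset is a list of triples, types are declared via `rdf:type`, and the type-declaring property itself is left out of the displayed list (a choice made here, not stated in the source). Case (iv), a user-defined property label, needs no lookup and is left to the caller.

```python
TYPE = "rdf:type"  # assumed property that declares a subject's type

# Toy dataset; identifiers and values are made up for illustration.
G = [
    ("A1", TYPE, "Article"),
    ("A1", "Title", "Data Web"),
    ("A1", "Author", "P1"),
    ("A2", TYPE, "Article"),
    ("A2", "Year", "2007"),
    ("P1", TYPE, "Person"),
    ("P1", "Name", "Tom"),
]


def properties_for(graph, subject, kind):
    """Return the property list for a subject chosen in step 1.

    kind is "type" (S in ST), "instance" (S in SI) or "variable" (S in V).
    """
    if kind == "type":
        # All properties used by any instance of this subject-type.
        instances = {s for s, p, o in graph if p == TYPE and o == subject}
        props = {p for s, p, _ in graph if s in instances}
    elif kind == "instance":
        # Only the properties of this particular instance.
        props = {p for s, p, _ in graph if s == subject}
    elif kind == "variable":
        # The subject is unconstrained, so every property qualifies.
        props = {p for _, p, _ in graph}
    else:
        raise ValueError(f"unknown subject kind: {kind!r}")
    props.discard(TYPE)  # presentation choice: hide the typing property
    return sorted(props)


print(properties_for(G, "Article", "type"))
print(properties_for(G, "A1", "instance"))
print(properties_for(G, "anything", "variable"))
```

Note that each call scans the whole triple list, which is exactly the cost the Graph Signature index described later is meant to avoid.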

Add an Object Filter

There are three types of filters the user can use to restrict the property P: a filtering function, an object identifier, or a query path. (i) A filtering function can be selected from a list. (ii) If a user wants to add an object identifier as a filter, a list of the possible objects is generated. For example, if a user previously chose Any Article as a subject and Author as a property, the list would contain the identifiers of the objects that appear as authors of articles.
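A minimal sketch of the first two filter types, assuming a triple-list dataset. The function `object_candidates` mirrors the Any Article / Author example; the filter names in `FILTERS` are hypothetical stand-ins for whatever functions the editor actually offers.

```python
TYPE = "rdf:type"  # assumed property that declares a subject's type

# Toy dataset; identifiers and values are made up for illustration.
G = [
    ("A1", TYPE, "Article"),
    ("A1", "Author", "P1"),
    ("A1", "Year", 2005),
    ("A2", TYPE, "Article"),
    ("A2", "Author", "P2"),
    ("A2", "Year", 2008),
    ("P1", TYPE, "Person"),
    ("P2", TYPE, "Person"),
]


def object_candidates(graph, subject_type, prop):
    """Objects reachable via `prop` from any instance of `subject_type`."""
    instances = {s for s, p, o in graph if p == TYPE and o == subject_type}
    return sorted({o for s, p, o in graph if s in instances and p == prop})


# Hypothetical filtering functions a user might pick from a list.
FILTERS = {
    "equals": lambda value, arg: value == arg,
    "more_than": lambda value, arg: value > arg,
    "less_than": lambda value, arg: value < arg,
}


def subjects_matching(graph, subject_type, prop, filter_name, arg):
    """Instances of `subject_type` whose `prop` value passes the filter."""
    test = FILTERS[filter_name]
    instances = {s for s, p, o in graph if p == TYPE and o == subject_type}
    return sorted(
        s for s, p, o in graph
        if s in instances and p == prop and test(o, arg)
    )


print(object_candidates(G, "Article", "Author"))
print(subjects_matching(G, "Article", "Year", "more_than", 2006))
```

The third filter type, a query path, would repeat the subject/property/filter steps starting from the selected object and is not shown here.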

Query Language

This module defines the notational system and constructs that make MashQL an expressive yet intuitive query language, supporting all constructs of SPARQL.
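To make the relationship to SPARQL concrete, the following sketch renders a subject-type plus a list of property restrictions as a SPARQL SELECT query. The source does not give MashQL's actual translation rules, so the function `to_sparql`, the restriction format, and the `http://example.org/` prefix are all assumptions; only numeric comparison filters are handled.

```python
PREFIXES = (
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
    "PREFIX : <http://example.org/>\n"  # placeholder namespace (assumption)
)


def to_sparql(subject_type, restrictions):
    """Build a SPARQL query string.

    restrictions is a list of (property, operator, value) tuples; an
    operator of None means "return this property without filtering".
    Values are assumed to be numbers, so no quoting or escaping is done.
    """
    variables = ["?s"]
    patterns = [f"?s rdf:type :{subject_type} ."]
    for prop, operator, value in restrictions:
        var = "?" + prop.lower()
        variables.append(var)
        patterns.append(f"?s :{prop} {var} .")
        if operator is not None:
            patterns.append(f"FILTER ({var} {operator} {value})")
    body = "\n".join("  " + pattern for pattern in patterns)
    return f"{PREFIXES}SELECT {' '.join(variables)}\nWHERE {{\n{body}\n}}"


query = to_sparql("Article", [("Title", None, None), ("Year", ">", 2006)])
print(query)
```

Each row the user adds in the editor corresponds here to one triple pattern, optionally followed by a FILTER clause, which is why the formulation steps can be carried out without the user ever writing SPARQL.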

Graph Signature Index

Because of the assumption that data is schema-free, the previous algorithm has to query the whole dataset in real time, which can be a performance bottleneck because such queries may involve many self-joins. Hence, the interactivity of MashQL might be unacceptable. We therefore propose a new way of indexing RDF, which we call the Graph Signature. The size of a Graph Signature is typically much smaller than the original graph, yielding fast query response times.
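The idea of answering navigation queries from a smaller summary graph can be sketched as follows. The construction of the Graph Signature is not given here, so the grouping rule below (nodes with the same set of outgoing properties share a group) is a deliberately simplified assumption and should not be read as the actual index definition.

```python
from collections import defaultdict

# Toy dataset; identifiers and values are made up for illustration.
G = [
    ("A1", "rdf:type", "Article"), ("A1", "Title", "T1"), ("A1", "Author", "P1"),
    ("A2", "rdf:type", "Article"), ("A2", "Title", "T2"), ("A2", "Author", "P1"),
    ("A3", "rdf:type", "Article"), ("A3", "Title", "T3"), ("A3", "Author", "P2"),
    ("P1", "rdf:type", "Person"), ("P1", "Name", "Tom"),
    ("P2", "rdf:type", "Person"), ("P2", "Name", "Ann"),
]


def build_summary(graph):
    """Group subjects by their outgoing property set and summarize edges.

    Returns (group_of, summary_edges). A target group of None means the
    object has no outgoing edges of its own (e.g. a literal value).
    """
    out_props = defaultdict(set)
    for s, p, _ in graph:
        out_props[s].add(p)

    group_ids = {}  # frozenset of properties -> group id
    group_of = {}   # node -> group id
    for node, props in out_props.items():
        group_of[node] = group_ids.setdefault(frozenset(props), len(group_ids))

    summary_edges = {(group_of[s], p, group_of.get(o)) for s, p, o in graph}
    return group_of, summary_edges


def properties_of(node, group_of, summary_edges):
    """Answer 'which properties does this node have?' from the summary only."""
    gid = group_of[node]
    return sorted({p for g, p, _ in summary_edges if g == gid})


group_of, summary = build_summary(G)
print(len(G), "triples summarized by", len(summary), "edges")
print(properties_of("A2", group_of, summary))
```

In this toy case three articles collapse into one group and two persons into another, so property lookups touch a handful of summary edges instead of every triple. The saving grows with the number of nodes that share the same structure.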

Implementation and Evaluation

We present two implementations of MashQL: a server-side mashup editor and a Firefox add-on. We evaluate the response time of MashQL on two large datasets, DBLP and DBPedia, and compare it with Oracle’s Semantic Technology. We show that queries can be answered instantly, regardless of the data size.