Semantic Web – A Survey
Advanced Analysis and Algorithms
Dr. Awais Adnan
Roll No 3
Institute of Management Science
Finding what has been searched for and requested, effectively and efficiently,
will be a major challenge in the coming years. An ordinary user spends a great
deal of time locating exactly the information needed. Semantic Web Mining can
help with this problem. Combining the Semantic Web and Web Mining improves
mining by exploiting semantics, and generates semantics by mining. Together,
both areas help make the web more meaningful.
An analysis of the results returned by search engines such as Google, Bing and
Yahoo shows that personalized access to the information available on the Web is
required (Svatopluk et al., 2005). As of 2008, the portion of the web accessible
to search engines was estimated at almost one trillion pages (Google Blog). The
sheer scale of the web, together with its decentralized, highly redundant and
largely imprecise nature, makes the knowledge within it difficult to use.
Moreover, the relevant knowledge can be scattered across many resources, which
makes attempts to use all the accessible content even more complicated. This
problem is commonly referred to as “information overload”. It is being tackled
by many technologies that draw inspiration from various fields of computer
science. Probably the most influential field in this context is information
retrieval, most visibly experienced in the form of web search engines such as
Google (http://www.google.com), Yahoo (http://www.yahoo.com) or Bing
(http://www.bing.com), which make finding related resources easier.
Information retrieval methods cover a substantial portion of the web content,
but they only float on the surface of the real meaning of the data they index,
because they depend on mere string-based statistics and heuristic ranking. The
Semantic Web tries to complement this rather superficial approach by adding
meaning to the strings of web content. The following sections give a short
overview of the areas of Semantic Web and Web Mining, followed by an overview
of challenges and future trends in Semantic Web implementation.
The Semantic Web is about giving meaning to data from different kinds of web
resources, so that machines can interpret and understand the enriched data and
answer web users’ requests precisely. The Semantic Web is a part of the
second-generation web (Web 2.0), and its original idea derives from the vision
of the W3C’s director and founder of the WWW, Sir Tim Berners-Lee. In this
vision, the Semantic Web is an extension of the World Wide Web that lets users
share their data beyond the hidden barriers and limitations of individual
programs and websites by using the meaning of web content. Overviews of various
emerging Semantic Web technologies are given below.
OWL-S: OWL-S (formerly DAML-S) is a services ontology within the OWL-based
framework of the Semantic Web that enables software agents to discover, invoke,
compose, and monitor Web resources.
OWL 2: OWL 2 extends the Web Ontology Language (OWL) with a small but useful
set of features. OWL 2 ontologies provide classes, properties, individuals, and
data values, and are stored as Semantic Web documents.
WSMO: The Web Service Modeling Ontology (WSMO) is a conceptual model for the
relevant aspects of Semantic Web Services. It supports automating the
discovery, combination, and invocation of electronic services over the Web.
WSML: The Web Service Modeling Language (WSML) provides a formal syntax and
semantics for WSMO. It consists of several variants, such as WSML-Core,
WSML-DL, WSML-Flight, WSML-Rule, and WSML-Full.
SWRL: The Semantic Web Rule Language (SWRL) aims to be the standard rule
language of the Semantic Web and is based on a combination of OWL DL, OWL Lite,
and RuleML sublanguages.
RuleML: RuleML constitutes a modular family of Web sublanguages that cover
derivation rules, queries, and integrity constraints, as well as production and
reaction rules.
RIF: The goal of the Rule Interchange Format (RIF) is to be the Semantic Web’s
standard for exchanging rules between rule systems.
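To make the idea of rules over Semantic Web data more concrete, the following
is a minimal sketch in plain Python rather than in SWRL or RIF syntax. It
assumes that data is held as (subject, predicate, object) triples and applies
one rule of the familiar form hasParent(x, y) and hasBrother(y, z) implies
hasUncle(x, z). The individuals, the predicate names and both helper functions
are hypothetical and chosen only for illustration.

```python
# Facts are (subject, predicate, object) triples, the basic shape of
# Semantic Web data. All names below are made up for illustration.
facts = {
    ("alice", "hasParent", "bob"),
    ("bob", "hasBrother", "carl"),
    ("dave", "hasParent", "erin"),
}


def apply_uncle_rule(triples):
    """Derive hasUncle(x, z) from hasParent(x, y) and hasBrother(y, z)."""
    derived = set()
    for (x, p1, y) in triples:
        if p1 != "hasParent":
            continue
        for (y2, p2, z) in triples:
            if p2 == "hasBrother" and y2 == y:
                derived.add((x, "hasUncle", z))
    return derived


def forward_chain(triples, rules):
    """Apply every rule repeatedly until no new triple is produced."""
    known = set(triples)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known) - known
        if not new:
            return known
        known |= new


closure = forward_chain(facts, [apply_uncle_rule])
print(sorted(t for t in closure if t[1] == "hasUncle"))
# prints [('alice', 'hasUncle', 'carl')]
```

Real rule engines index the triples instead of scanning them in nested loops;
the loop form is used here only because it keeps the rule readable.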
Figure 1: Semantic Web Architecture
The meaning of linguistic expressions is studied by a specific subdiscipline of
linguistics, semantics. Meaning is analyzed at the level of words, phrases,
sentences, and larger units of discourse. The basic subjects of study in
semantics are signs, which may be understood as discrete units of meaning
(words, images, gestures, scents, tastes, textures, sounds, etc., essentially
all forms in which information can be transferred between the participants in a
communication process). Two major distinct conceptions of signs were proposed
by two key figures in the birth of modern linguistics:
Dualistic signs: According to Saussure, a sign is composed of the signifier and
the signified. The former is conceived as a linguistic representation of a
conceivable or perceivable entity or idea, while the latter is the mental
representation, or concept, of the entity or idea that is being signified. The
relationship between the signifier and the signified in a sign is completely
arbitrary.
Signs as triadic relations: Peirce rejects the idea of a stable relationship
between a signifier and its signified. Departing from language-based
motivations, he introduced a notion of sign motivated largely by philosophical
logic. His main focus was on proposing a theory of the production of meaning
rather than a theory of language itself. The result is a notion of sign that
produces meaning through recursive relationships among three sets,
corresponding to three basic semiotic components:
Representamen: The symbolic
representation of the denoted thing, object or idea.
Object: The object being represented by the sign.
Interpretant: The meaning of the sign, represented by yet another sign that is
determined by the process of interpretation.
The relations among the three sets of semiotic components describe how the
meaning of a sign is linked with its actual representation in language and in
the world. The basic tools employed in investigations that focus on lexical
semantics are lexical relations such as antonymy, synonymy, hyponymy and
hypernymy. The meaning of lexical units is usually determined in a top-down way
by human experts (lexicographers) after studying relevant language resources.
In distributional semantics, by contrast, the meaning is established by
empirical analysis of general patterns that appear between words in large-scale
data sets. The distributional, or statistical, approach to semantics is
essentially bottom-up and can be automated to a large extent.
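The bottom-up, distributional approach mentioned above can be illustrated with
a small sketch. It assumes a toy corpus, represents each word by the counts of
the words that occur near it, and compares words by cosine similarity. The
corpus, the window size of two words and the function names are illustrative
assumptions, not part of any particular system.

```python
import math
from collections import Counter, defaultdict

# A toy corpus; real distributional models use very large text collections.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "the car needs fuel",
]


def cooccurrence_vectors(sentences, window=2):
    """Count, for every word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = cooccurrence_vectors(corpus)
print(cosine(vectors["cat"], vectors["dog"]))   # high: similar contexts
print(cosine(vectors["cat"], vectors["fuel"]))  # 0.0: no shared contexts
```

In this corpus "cat" and "dog" appear in similar contexts and score close to
one, while "cat" and "fuel" share no context words and score zero. No human
expert was consulted, which is what makes the approach easy to automate.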
Determining the meaning of single words or phrases is only the first step
towards studying the semantics of more complicated natural language structures
such as sentences. The meaning of a sentence is analyzed by first parsing it
into its syntactic tree. The components of the parse tree are then transformed
into a logical form, which is in turn used for the logical analysis of the
sentence by means of its associated truth conditions.
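The pipeline from parse tree to logical form to truth conditions can be
sketched as follows. The sketch assumes a hand-built parse tree for one simple
transitive sentence (no parser is involved), a predicate-argument tuple as the
logical form, and a toy model of the world given as a set of facts. All names
are hypothetical.

```python
# A hand-built parse tree for "Alice likes Bob": (label, children...).
parse_tree = ("S",
              ("NP", "Alice"),
              ("VP", ("V", "likes"), ("NP", "Bob")))


def to_logical_form(tree):
    """Map an S -> NP (V NP) tree to a (predicate, subject, object) tuple."""
    label, subject, verb_phrase = tree
    if label != "S" or verb_phrase[0] != "VP":
        raise ValueError("only simple transitive sentences are supported")
    _, verb, obj = verb_phrase
    return (verb[1], subject[1], obj[1])


def is_true(logical_form, model):
    """Truth condition: the sentence is true iff the model contains the fact."""
    return logical_form in model


# A toy model of the world: the set of facts that hold in it.
model = {("likes", "Alice", "Bob"), ("likes", "Bob", "Carol")}

form = to_logical_form(parse_tree)
print(form, is_true(form, model))
# prints ('likes', 'Alice', 'Bob') True
```

Realistic semantic analysis handles quantifiers, negation and embedding, which
this sketch deliberately leaves out.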
Web mining is a very interesting research topic that combines two active
research areas: data mining and the World Wide Web. The World Wide Web is a
fruitful area for data mining research because a large amount of information is
available online. Web mining research relates to the research communities of
databases, information retrieval, and AI. The World Wide Web is a popular and
interactive medium for disseminating information today. The Web is huge and
diverse, and thus raises issues of scalability and of handling multimedia data,
respectively. Oren Etzioni first coined the term Web mining in his 1996 paper.
Etzioni starts from the hypothesis that the information on the Web is
sufficiently structured, outlines the subtasks of Web mining, and describes the
Web mining processes. Web data mining can be defined as the discovery and
analysis of useful information from World Wide Web data.
CHALLENGES AND FUTURE TRENDS
The Web poses new challenges to traditional data mining algorithms, which were
designed to work on flat data. Some traditional data mining algorithms have
been extended, and new algorithms have been introduced, to work on Web data.
With the explosive growth of the information resources available on the World
Wide Web (WWW), it has become increasingly necessary for users to rely on
automated tools to locate the information resources they need and to track
their usage patterns. These factors give rise to the need for server-side and
client-side intelligent systems that can mine for knowledge effectively. The
analysis of huge web log files is a complicated task that is not fully
addressed by existing access analyzers. Moreover, it is hard to find suitable
tools for analyzing raw web log data to retrieve important and useful
information. Some web log analysis tools are commercially available, but most
of them are disliked by users and are considered slow, inflexible, expensive,
difficult to maintain, and very limited in the results they can provide.
Although some tools that use data mining techniques for web log analysis are
being created, the research is still in its infancy. Current techniques for
analyzing web usage data have various drawbacks, such as large storage
requirements, excessive I/O cost, or scalability problems when additional
information is introduced into the analysis.
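As a small illustration of the web log analysis discussed above, the following
sketch extracts simple usage statistics from raw log lines. It assumes the logs
are in Common Log Format; the sample lines, the regular expression and the
summarize function are illustrative assumptions and do not describe any of the
commercial tools mentioned.

```python
import re
from collections import Counter

# Sample lines in Common Log Format; hosts and paths are invented.
LOG_LINES = [
    '192.0.2.1 - - [10/Oct/2008:13:55:36 +0000] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2008:13:56:01 +0000] "GET /about.html HTTP/1.0" 200 1024',
    '192.0.2.1 - - [10/Oct/2008:13:57:12 +0000] "GET /index.html HTTP/1.0" 304 0',
    '192.0.2.3 - - [10/Oct/2008:13:58:40 +0000] "GET /missing HTTP/1.0" 404 512',
    'not a log line',
]

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize(lines):
    """Count hits and distinct visiting hosts per path, one line at a time."""
    hits = Counter()
    visitors = {}
    skipped = 0
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            skipped += 1          # malformed line, counted but not analyzed
            continue
        if match["status"].startswith(("4", "5")):
            continue              # ignore client and server errors
        hits[match["path"]] += 1
        visitors.setdefault(match["path"], set()).add(match["host"])
    return hits, visitors, skipped


hits, visitors, skipped = summarize(LOG_LINES)
print(hits.most_common())
print({path: len(hosts) for path, hosts in visitors.items()})
print("skipped:", skipped)
```

Because the function reads one line at a time, it can be fed a file object
directly and never needs the whole log in memory. The per-path sets of hosts
still grow with the number of distinct visitors, which is one form of the
storage cost noted above.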
Designing and maintaining web-based information systems, such as Web sites, is
a great challenge. On the Web, it is much easier to find an inconsistent source
of information than a well-structured site. There is an important relation
between structured documents (i.e., Web sites) and programs; the Web is
therefore a good candidate for experimenting with some of the technologies that
have been developed in the area of software engineering.
Web mining is a relatively new and quickly developing research and application
area. With more collaborative research across disciplines such as databases,
AI, statistics and marketing, we will be able to develop web mining
applications that are genuinely useful to web-based information systems. In
recent years, Web mining has been an important topic in data mining research
from the standpoint of supporting human-centered discovery of knowledge. The
present-day model of web mining suffers from a number of shortcomings, as
listed earlier. As services over the web continue to grow,
there will be a continuing need to make them fast, robust, scalable and