APACHE STANBOL

Apache Stanbol website: https://stanbol.apache.org/docs/trunk/tutorial.html

Stanbol helps model semantic relationships on top of NLP. Given a document, it can find the main concepts through named entity recognition (NER) and link the detected entities to DBpedia or to an enterprise database.

There are two ways to use Stanbol:

1) Use the RESTful API
2) Use the Java API

Using the RESTful API
----------------------------------

Step 1: export MAVEN_OPTS="-Xmx1024M -XX:MaxPermSize=256M"
Step 2: svn co http://svn.apache.org/repos/asf/stanbol/trunk stanbol
Step 3: mvn clean install (run from the checked-out stanbol directory)
Step 4: java -Xmx1g -jar stable/target/org.apache.stanbol.launchers.stable-{snapshot-version}-SNAPSHOT.jar (substitute the name of the jar your build produced)
Step 5: Open http://localhost:8080 in a web browser
Step 6: The Stanbol components are now available. For example, click on the enhancer, enter some text, and it returns the named entities it found along with their related DBpedia links.

Step 7 (alternative to Step 6): call the enhancer from the command line with curl:

curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
     --data "The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley." \
     http://localhost:8080/enhancer

The response contains the enhancement results in Turtle format.
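
The same request can also be sent from plain Java without any Stanbol-specific library. The following is a minimal sketch, assuming Java 11 or later (for java.net.http) and a Stanbol instance listening on http://localhost:8080. The class name EnhancerRestSketch and both helper methods are hypothetical names chosen for illustration. The entityReferences helper assumes the Turtle response marks linked entities with a property whose local name is entity-reference followed by a URI in angle brackets; it is a rough regex scan, not a real RDF parser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnhancerRestSketch {

    // Builds the same POST as the curl command: plain text in, Turtle out.
    static HttpRequest buildRequest(String baseUrl, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/enhancer"))
                .header("Accept", "text/turtle")
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(text))
                .build();
    }

    // Collects the URIs that directly follow an entity-reference property.
    // Assumption: the property is written either with a prefix (xyz:entity-reference)
    // or as a full URI ending in entity-reference>, and its object is a <uri>.
    static Set<String> entityReferences(String turtle) {
        Pattern pattern = Pattern.compile("entity-reference>?\\s+<([^>]+)>");
        Set<String> refs = new LinkedHashSet<>();
        Matcher matcher = pattern.matcher(turtle);
        while (matcher.find()) {
            refs.add(matcher.group(1));
        }
        return refs;
    }

    public static void main(String[] args) throws InterruptedException {
        String baseUrl = "http://localhost:8080"; // assumed local Stanbol instance
        HttpRequest request = buildRequest(baseUrl,
                "The Stanbol enhancer can detect famous cities such as Paris "
                + "and people such as Bob Marley.");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP status: " + response.statusCode());
            // Print each linked entity URI found in the Turtle body.
            entityReferences(response.body()).forEach(System.out::println);
        } catch (IOException e) {
            System.out.println("Could not reach Stanbol at " + baseUrl + ": " + e.getMessage());
        }
    }
}
```

Keeping the request construction separate from the network call makes it possible to check the headers and URL without a running server. For anything beyond a quick look at the results, a proper RDF library would be a safer way to read the Turtle output than a regular expression.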


Using the Java API
----------------
We can download the Apache Stanbol Client API and integrate it into a Java project from
https://github.com/zaizi/apache-stanbol-client .


After downloading and unzipping the archive, import it into Eclipse as a Java Maven project. Then we can call the enhancer with the code below:

// The Stanbol client classes used here (StanbolClientFactory, Enhancer, EnhancerParameters,
// EnhancementStructure, TextAnnotation, EntityAnnotation and the two exceptions) come from
// the apache-stanbol-client project; let the IDE resolve their imports.
public class Sample {

    public static void main(String[] args) throws StanbolServiceException, StanbolClientException {
        Sample sample = new Sample();
        sample.SimpleContentEnhancement();
    }

    public void SimpleContentEnhancement() throws StanbolServiceException, StanbolClientException {
        // Point the client at the running Stanbol instance
        final StanbolClientFactory factory = new StanbolClientFactory("http://localhost:8080");
        final Enhancer client = factory.createEnhancerClient();

        // Build a default enhancement request for the given text and send it
        EnhancerParameters parameters = EnhancerParameters
                .builder()
                .buildDefault("Paris is the capital of France");
        EnhancementStructure eRes = client.enhance(parameters);
        eRes.getBestAnnotations(); // return value is not used in this sample

        // Print each text annotation together with its candidate entities
        for (TextAnnotation ta : eRes.getTextAnnotations()) {
            System.out.println("********************************************");
            System.out.println("Selection Context: " + ta.getSelectionContext());
            System.out.println("Selected Text: " + ta.getSelectedText());
            System.out.println("Engine: " + ta.getCreator());
            System.out.println("Candidates: ");
            for (EntityAnnotation ea : eRes.getEntityAnnotations(ta)) {
                System.out.println("\t" + ea.getEntityLabel() + " - " + ea.getEntityReference());
            }
        }
    }
}



(You can refer to the actual documentation at this link:
https://github.com/zaizi/apache-stanbol-client )


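The program above prints one block per text annotation: the selection context, the selected text, the engine that created the annotation, and the candidate entities with their labels and references.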