
APACHE STANBOL

Apache Stanbol - Website  - https://stanbol.apache.org/docs/trunk/tutorial.html

Apache Stanbol provides semantic services built around NLP. Given a document, it can find the main concepts in the text (for example, named entities detected through NER) and link those entities to DBpedia or to an enterprise database.

There are two ways to use Stanbol:

1) Use the RESTful API
2) Use the Java API

Using the RESTful API
----------------------------------

Step 1: export MAVEN_OPTS="-Xmx1024M -XX:MaxPermSize=256M"
Step 2: svn co http://svn.apache.org/repos/asf/stanbol/trunk stanbol
Step 3: mvn clean install (run from the checked-out stanbol directory)
Step 4: java -Xmx1g -jar stable/target/org.apache.stanbol.launchers.stable-{snapshot-version}-SNAPSHOT.jar (substitute the name of the jar that your build produced)
Step 5: Open http://localhost:8080 in a web browser
Step 6: The Stanbol services are now available. For example, click on the enhancer and enter some text; it returns the named entities it found together with their related DBpedia links.

Step 7 (alternative to using the browser in Step 6): call the enhancer from the command line with curl:

    curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
         --data "The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley." \
         http://localhost:8080/enhancer

The response contains the enhancement results in Turtle format, as requested by the Accept header.
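
The same request can also be sent from plain Java without any Stanbol-specific library. The following is only a minimal sketch, assuming Java 11 or later (for java.net.http) and a Stanbol instance running on http://localhost:8080 as in the steps above. The class name EnhancerRestSketch and the helper buildRequest are made up for this illustration and are not part of Stanbol.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical class name, used only for this sketch.
public class EnhancerRestSketch {

    // Builds the same POST request as the curl command: plain text in, Turtle out.
    static HttpRequest buildRequest(String baseUrl, String text) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/enhancer"))
                .header("Accept", "text/turtle")
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(text))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest(
                "http://localhost:8080",
                "The Stanbol enhancer can detect famous cities such as Paris "
                        + "and people such as Bob Marley.");

        // Sending only works while the Stanbol launcher from Step 4 is running.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println("HTTP status: " + response.statusCode());
        System.out.println(response.body());
    }
}
```

Keeping the request construction in its own method means the URL and headers can be checked without a running server; only main needs Stanbol to be up.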


Using the Java API
----------------
The Apache Stanbol Client API can be downloaded and integrated into a Java project from
https://github.com/zaizi/apache-stanbol-client .


After downloading and unzipping the project, import it into Eclipse as a Java Maven project. Then the enhancer can be called with the code below:

// The Stanbol classes used below (StanbolClientFactory, Enhancer, EnhancerParameters,
// EnhancementStructure, TextAnnotation, EntityAnnotation and the two exception types)
// come from the apache-stanbol-client project; import them from that library.
public class Sample {

    public static void main(String[] args) throws StanbolServiceException, StanbolClientException {
        Sample sample = new Sample();
        sample.simpleContentEnhancement();
    }

    public void simpleContentEnhancement() throws StanbolServiceException, StanbolClientException {
        // Point the client at the locally running Stanbol instance.
        final StanbolClientFactory factory = new StanbolClientFactory("http://localhost:8080");
        final Enhancer client = factory.createEnhancerClient();

        // Build default enhancer parameters for the text to analyse.
        EnhancerParameters parameters = EnhancerParameters
                .builder()
                .buildDefault("Paris is the capital of France");
        EnhancementStructure eRes = client.enhance(parameters);
        eRes.getBestAnnotations(); // return value is not used in this sample

        // Print every text annotation followed by its candidate entities.
        for (TextAnnotation ta : eRes.getTextAnnotations()) {
            System.out.println("********************************************");
            System.out.println("Selection Context: " + ta.getSelectionContext());
            System.out.println("Selected Text: " + ta.getSelectedText());
            System.out.println("Engine: " + ta.getCreator());
            System.out.println("Candidates: ");
            for (EntityAnnotation ea : eRes.getEntityAnnotations(ta)) {
                System.out.println("\t" + ea.getEntityLabel() + " - " + ea.getEntityReference());
            }
        }
    }
}



(You can refer to the original documentation at this link:
https://github.com/zaizi/apache-stanbol-client )


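For each text annotation, the program above prints the selection context, the selected text, the engine that created it, and the list of candidate entities with their labels and references.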