Codinko- Java Coding Tutorials

Author: admin (page 1 of 10)

Elasticsearch scroll API

https://www.elastic.co/guide/en/elasticsearch/reference/7.6/search-request-body.html#request-body-search-scroll

Request Body Search

Specifies search criteria as request body parameters.

GET /twitter/_search
{
    "query" : {
        "term" : { "user" : "kimchy" }
    }
}

HTTP GET and HTTP POST supported?

Both HTTP GET and HTTP POST can be used to execute search with body. Since not all clients support GET with body, POST is allowed as well.

  • In case we only want to know if there are any documents matching a specific query, we can set the size to 0 to indicate that we are not interested in the search results.

About terminate_after

  • Also we can set terminate_after to 1 to indicate that the query execution can be terminated whenever the first matching document was found (per shard).
GET /_search?q=message:number&size=0&terminate_after=1

The response will not contain any hits as the size was set to 0.

hits.total

The hits.total will be either

  • equal to 0, indicating that there were no matching documents, or
  • greater than 0 meaning that there were at least as many documents matching the query when it was early terminated.

    Also if the query was terminated early, the terminated_early flag will be set to true in the response.

{
  "took": 3,
  "timed_out": false,
  "terminated_early": true,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped" : 0,
    "failed": 0
  },
  "hits": {
    "total" : {
        "value": 1,
        "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

The took time in the response contains

  • the milliseconds that this request took for processing,
  • beginning quickly after the node received the query,
  • up until all search related work is done and
  • before the above JSON is returned to the client.
  • This means it includes the time spent waiting in thread pools, executing a distributed search across the whole cluster and gathering all the results.

Scroll

Q) search vs scroll

 

  • While search request returns a single “page” of results,
  • the scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database.

Q) What is the scroll use-case?

  • Scrolling is not intended for real time user requests, but rather
  • Scrolling is intended for processing large amounts of data, e.g. in order to reindex the contents of one index into a new index with a different configuration, or to reconcile data between clusters.

Q) Impact of scroll on newer data 

  • The results that are returned from a scroll request reflect the state of the index at the time that the initial search request was made, like a snapshot in time.
  • Subsequent changes to documents (index, update or delete) will only affect later search requests (not the scroll requests).

 

Let’s see a scroll request in action 🙂 – scroll/search-context/ scroll_id

In order to use scrolling,

  • the initial search request should specify the scroll parameter in the query string,
  • which tells Elasticsearch how long it should keep the “search context” alive (see Keeping the search context alive),
  • e.g. ?scroll=1m.

POST /twitter/_search?scroll=1m
{
    "size": 100,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    }
}

The size parameter allows you to configure the maximum number of hits to be returned with each batch of results.

Passing _scroll_id in scroll API for subsequent results

The result from the above request includes a _scroll_id, which should be passed to the scroll API in order to retrieve the next batch of results.

POST /_search/scroll 
{
    "scroll" : "1m", 
    "scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==" 
}

 

  • Each call to the scroll API returns the next batch of results until there are no more results left to return,
  • ie the hits array is empty.

 

Few points to note regarding GET/POST/URL/scroll parameter/scroll_id parameter

  • GET or POST can be used.
  • The URL should not include the index name; this is specified in the original search request instead.
  • The scroll parameter tells Elasticsearch to keep the search context open for another 1m.
  • The scroll_id parameter carries the _scroll_id returned by the previous search or scroll response.

 

Notes regarding _scroll_id / request-with-aggs/requests-sort-order-as_doc:

  1. The initial search request and each subsequent scroll request each return a _scroll_id. While the _scroll_id may change between requests, it doesn’t always change — in any case, only the most recently received _scroll_id should be used.
  2. If the request specifies aggregations, only the initial search response will contain the aggregations results.
  3. Scroll requests have optimizations that make them faster when the sort order is _doc. If you want to iterate over all documents regardless of the order, this is the most efficient option
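
The request sequence described above can be sketched as a plain Java loop. This is only an illustration under stated assumptions: ScrollClient, Page and FakeClient are hypothetical names invented here, not part of any Elasticsearch client library, and the in-memory FakeClient stands in for a real cluster so the sketch can run on its own. A real implementation would send the HTTP requests shown in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScrollLoopSketch {

    /** One batch of results plus the scroll id to send with the next request. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Hypothetical abstraction over the three HTTP calls described in this post. */
    interface ScrollClient {
        Page search(String keepAlive, int size);        // POST /<index>/_search?scroll=<keepAlive>
        Page scroll(String keepAlive, String scrollId); // POST /_search/scroll
        void clearScroll(String scrollId);              // DELETE /_search/scroll
    }

    /** Reads every batch, always passing on the most recently received scroll id. */
    static List<String> fetchAll(ScrollClient client, String keepAlive, int size) {
        List<String> all = new ArrayList<>();
        Page page = client.search(keepAlive, size);
        String scrollId = page.scrollId;
        try {
            while (!page.hits.isEmpty()) {      // an empty hits array means there is nothing left
                all.addAll(page.hits);
                page = client.scroll(keepAlive, scrollId);
                scrollId = page.scrollId;       // the id may change between requests
            }
        } finally {
            client.clearScroll(scrollId);       // free the search context explicitly
        }
        return all;
    }

    /** In-memory stand-in for a cluster, used only to make the sketch runnable. */
    static final class FakeClient implements ScrollClient {
        private final List<String> docs;
        private int position = 0;
        private int batchSize = 0;
        private int calls = 0;
        boolean cleared = false;

        FakeClient(List<String> docs) {
            this.docs = docs;
        }

        public Page search(String keepAlive, int size) {
            batchSize = size;
            return nextPage();
        }

        public Page scroll(String keepAlive, String scrollId) {
            return nextPage();
        }

        public void clearScroll(String scrollId) {
            cleared = true;
        }

        private Page nextPage() {
            int end = Math.min(position + batchSize, docs.size());
            List<String> hits = new ArrayList<>(docs.subList(position, end));
            position = end;
            return new Page("ctx-" + (++calls), hits);
        }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient(Arrays.asList("d1", "d2", "d3", "d4", "d5"));
        System.out.println(fetchAll(client, "1m", 2));
        System.out.println("cleared: " + client.cleared);
    }
}
```

Clearing the scroll in a finally block is a design choice of this sketch: it releases the search context even if processing a batch fails, instead of waiting for the timeout.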

 

More details if you are interested to read on :

Keeping the search context alive

A scroll returns all the documents which matched the search at the time of the initial search request. It ignores any subsequent changes to these documents. The scroll_id identifies a search context which keeps track of everything that Elasticsearch needs to return the correct documents. The search context is created by the initial request and kept alive by subsequent requests.

The scroll parameter (passed to the search request and to every scroll request) tells Elasticsearch how long it should keep the search context alive. Its value (e.g. 1m, see Time units) does not need to be long enough to process all data — it just needs to be long enough to process the previous batch of results. Each scroll request (with the scroll parameter) sets a new expiry time. If a scroll request doesn’t pass in the scroll parameter, then the search context will be freed as part of that scroll request.

Normally, the background merge process optimizes the index by merging together smaller segments to create new, bigger segments. Once the smaller segments are no longer needed they are deleted. This process continues during scrolling, but an open search context prevents the old segments from being deleted since they are still in use.

Keeping older segments alive means that more disk space and file handles are needed. Ensure that you have configured your nodes to have ample free file handles. See File Descriptors.

Additionally, if a segment contains deleted or updated documents then the search context must keep track of whether each document in the segment was live at the time of the initial search request. Ensure that your nodes have sufficient heap space if you have many open scrolls on an index that is subject to ongoing deletes or updates.

To protect against issues caused by having too many scrolls open, the user is not allowed to open scrolls past a certain limit. By default, the maximum number of open scrolls is 500. This limit can be updated with the search.max_open_scroll_context cluster setting.

 

You can check how many search contexts are open with the nodes stats API:

GET /_nodes/stats/indices/search

 

Clear scroll API

Search contexts are automatically removed when the scroll timeout has been exceeded. However, keeping scrolls open has a cost, as discussed in the previous section, so scrolls should be explicitly cleared as soon as they are no longer being used, using the clear-scroll API:

DELETE /_search/scroll
{
    "scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

Multiple scroll IDs can be passed as array:

DELETE /_search/scroll
{
    "scroll_id" : [
      "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==",
      "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAABFmtSWWRRWUJrU2o2ZExpSGJCVmQxYUEAAAAAAAAAAxZrUllkUVlCa1NqNmRMaUhiQlZkMWFBAAAAAAAAAAIWa1JZZFFZQmtTajZkTGlIYkJWZDFhQQAAAAAAAAAFFmtSWWRRWUJrU2o2ZExpSGJCVmQxYUEAAAAAAAAABBZrUllkUVlCa1NqNmRMaUhiQlZkMWFB"
    ]
}

All search contexts can be cleared with the _all parameter:

DELETE /_search/scroll/_all

The scroll_id can also be passed as a query string parameter or in the request body. Multiple scroll IDs can be passed as comma separated values:

DELETE /_search/scroll/DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==,DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAABFmtSWWRRWUJrU2o2ZExpSGJCVmQxYUEAAAAAAAAAAxZrUllkUVlCa1NqNmRMaUhiQlZkMWFBAAAAAAAAAAIWa1JZZFFZQmtTajZkTGlIYkJWZDFhQQAAAAAAAAAFFmtSWWRRWUJrU2o2ZExpSGJCVmQxYUEAAAAAAAAABBZrUllkUVlCa1NqNmRMaUhiQlZkMWFB


Types of Spring Dependency injection and what to use

Injection types

There are three options for how dependencies can be injected into a bean:

  1. Through a constructor.

    @Component
    public class MyComponent {
        private final Trade t;
    
        @Autowired
        public MyComponent(Trade t){
           this.t = t;
        }
    }
  2. Through setters or other methods.

    – When you use @Autowired on the setter method.

  3. Through reflection, directly into fields.

    – when you use @Autowired directly on your field

    @Component
    public class MyComponent {
        @Autowired
        private Trade t;
    }

Injection guidelines

A general guideline, which is recommended by Spring (see the sections on Constructor-based DI or Setter-based DI) is the following:

  • For mandatory dependencies or when aiming for immutability, use constructor injection
  • For optional or changeable dependencies, use setter injection
  • Avoid field injection in most cases

Why Field injection is to be avoided?

The reasons why field injection is discouraged are as follows:

  • You cannot create immutable objects, as you can with constructor injection
  • Your classes have tight coupling with your DI container and cannot be used outside of it
  • Your classes cannot be instantiated (for example in unit tests) without reflection. You need the DI container to instantiate them, so your test is not actually a ‘unit’ test but an integration test.
  • It is very easy to end up with ten or so dependencies. With constructor injection, you would have a constructor with ten arguments, which would signal that something is wrong. With field injection you can keep adding injected fields indefinitely. Having too many dependencies is a red flag that the class probably does more than one thing and may violate the Single Responsibility Principle.
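
To illustrate the testability point, here is a minimal sketch of a constructor-injected class being created in plain Java, with no DI container involved. The Trade interface and the doubledAmount() method are hypothetical stand-ins made up for this example; the Spring annotations are left out on purpose so the snippet needs nothing beyond the JDK.

```java
import java.util.Objects;

class ConstructorInjectionSketch {

    /** Hypothetical dependency; stands in for the Trade bean used earlier. */
    interface Trade {
        double amount();
    }

    /** Same shape as MyComponent above, minus the Spring annotations. */
    static final class MyComponent {
        private final Trade t;   // final: the dependency cannot be swapped after construction

        MyComponent(Trade t) {
            this.t = Objects.requireNonNull(t, "t");
        }

        double doubledAmount() {
            return 2 * t.amount();
        }
    }

    public static void main(String[] args) {
        Trade stub = () -> 21.0;                        // hand-written stub, no container needed
        MyComponent component = new MyComponent(stub);
        System.out.println(component.doubledAmount());  // prints 42.0
    }
}
```

With field injection the same test would need either the container or reflection to set the private field.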

Summary

  • Depending on your needs, you should primarily use constructor injection or some mix of constructor and setter injection.
  • Field injection has many drawbacks and should be avoided.

Reference: https://stackoverflow.com/questions/39890849/what-exactly-is-field-injection-and-how-to-avoid-it

 

lightweight framework and Why Spring is called a lightweight framework

Why Spring is called a lightweight framework

  • Spring is considered lightweight compared to traditional Java EE applications.
  • It is lightweight because it allows minimally invasive development with POJOs.
  • We consider Spring to be lightweight when we are comparing to normal J2EE container.
  • It is lightweight in the sense of extra memory footprint for the facilities provided (e.g. Transaction Control, Life Cycle, Component dependency management)
  • Spring calls itself ‘lightweight’ because you don’t need all of Spring to use part of it. For example, you can use Spring JDBC without Spring MVC.
  • Spring provides various modules for different purposes; you can just inject dependencies according to your required module. That is, you don’t need to download or inject all dependencies or all JARs to use a particular module.
  • Whether it is “lightweight” or “heavyweight”, it is all about comparison.
  • However, there are sometimes other criteria to compare for the “weight” of a container, e.g. intrusiveness in design and implementation; facilities provided etc.
  • Ironically, Spring is sometimes treated as heavy weight container when compared to other POJO-based container, like Guice and Plexus.
  • “Lightweight” is mostly a buzzword. Its meaning is highly subjective and based on context. It can mean “low memory footprint”, low execution overhead, or low start-up overhead. People also use it to differentiate between some perceived level of complexity and/or learning curve. In any case, it’s assuredly relative, as there is no defined point on any scale where “light” becomes “heavy” in terms of “weight”.

What is a Lightweight Web Application

For this we need to understand a heavyweight web application.

  • The primary focus of J2EE is for developing distributed and loosely coupled middleware applications
  • Those applications typically have a web front end and a relational database backend. As web applications become popular in recent years, J2EE has gained wide acceptance among developers. In fact, J2EE is one of the few dominant web application platforms today.
  • However, as J2EE is more widely used, developers are also increasingly frustrated by its complexities.
  • The original design of J2EE addresses the distributed computing problems and application scalability problems that only the biggest enterprises encounter.
  • In practice, however, most developers use J2EE to develop small to mid-sized web applications, which is not what it was originally designed for.
  • Those “enterprise features” not only are of limited use but also add unnecessary complexity to otherwise simple web applications.
  • The problems have grown as J2EE has evolved and added more features to address a wider range of enterprise use cases.
  • In order to make enterprise Java more appealing to majority of developers who work with small web applications, the Java community has made major efforts to simplify Java web and middleware frameworks.
  • The enterprise Java specification after J2EE 1.4 is called Java EE 5.0. This change highlights the significant changes lightweight frameworks brought to the enterprise Java standards.

The J2EE 1.4 specification, released in 2004, is the last version of heavyweight J2EE.

What is a typical JEE application?

  • A typical J2EE web application has servlets that take in user input and generate the response by displaying JavaServer Pages.
  • The servlet delegates business operations and database-related work to an EJB (Enterprise JavaBeans) module containing session bean objects and entity bean objects.
  • The stateless session bean typically contains transactional methods to perform business operations.
  • Those methods are exposed to the servlet.
  • The session bean makes use of entity beans to access the relational database.
  • In the XML configuration files, we define how to use container services (e.g., transactions and security) from the session beans, as well as how the entity beans are mapped to database tables.
  • Figure 1.1, “Architecture of a J2EE 1.4 web application” shows the above-described architecture
  • Since the EJB managed components cannot be serialized out of the container, we have to use value objects to exchange data across the layers (i.e., method call parameters and return values).

Architecture of a J2EE 1.4 web application

All the J2EE components work together to serve a common purpose:

  • to make the application more scalable.

    Below is a list of key architectural characteristics that make J2EE great.

    Component-based architecture:

    J2EE components are advanced forms of Java objects. Each component completely encapsulates its own code, configuration, and outward interface. The entire application can be composed from a set of reusable components. Those components can reside on a single computer or on a network of computers to improve application scalability. As we will see in the next several bullet items, components make the application simpler and easier to maintain.

    Loose coupling between components:

    When J2EE components make method calls against another component (e.g., the servlet calls a method in the session bean), the caller component never instantiates the callee component. Instead, the caller requests a reference (or “stub”) to the callee from the container using the callee’s interface. The container manages all the object creation and the components are only coupled by interfaces. The benefit is that when we change a component implementation, the effect of the change would not ripple out of the component itself as long as we keep its published interface stable.

    Shared services provided by the container:

    The J2EE container provides common application services, such as transaction management, network connection pools, access security, data persistence, and logging, to the components. That allows the application to focus on the business logic.

    Declarative configuration:

    In J2EE, you can simply configure how the container service is delivered to your components using XML configuration files. The use of metadata to configure services reduces the clutter in the code and makes J2EE components easier to maintain.

    A complete Object Relational Mapping (ORM) solution:

    You can use J2EE entity beans to model application data stored in a backend relational database. That allows you to operate on Java objects instead of dealing with the relational model via SQL statements. The details of vendor-dependent SQL statements are generated by the container and are completely transparent to the application developer.

What is a Lightweight Framework

  • The “lightweight” component approach was originally proposed to counter the “heavyweight” approach in EJB 2.1
  •  EJB 2.1 is the core business component framework in J2EE 1.4, and it is notoriously hard to use
  • EJB3 is core to Java EE 5.0, and is an industry-wide effort to standardize a lightweight framework for Java enterprise developers.
  • EJB3 looks almost completely different from EJB 2.1, yet it captures the flexibility and power of the J2EE way.
  • (a)  POJOs as Components
  • (b) Annotation-based Metadata for Services Configuration
  • (c) Dependency Injection
  • (d) Extensible Container Services

Reference:

https://docs.jboss.org/books/lightweight/ch01.html

JSON parsing methods

1) Use JsonNode with ObjectMapper using mapper.readTree(jsonString)

  • eg: String jsonString = "{\"name\":\"superman\",\"age\":\"25\"}";

ObjectMapper mapper = new ObjectMapper();
JsonNode rootNode = mapper.readTree(jsonString);
JsonNode jsonNode1 = rootNode.get("name");

example reading from json file:

InputStream inputStream = ExampleStructure.class
    .getResourceAsStream("example.json");
JsonNode rootNode = mapper.readTree(inputStream);

// usage of path() in addition to get(): path() returns a MissingNode instead of null when a field is absent

rootNode.path("address").path("country").textValue();

2) Use ObjectMapper and mapper.readValue(source_json, destination_pojo);

3) Use JSONObject(source_json); the destination is a jsonObject.getJsonArray("nodename") etc.

FAQ) So readTree() is for JsonNode and readValue() is for POJOs?

No, you can use readValue() for JsonNode as well:

ie: JsonNode jn = new ObjectMapper().readValue(src, JsonNode.class);

Additional properties

objectMapper.configure(
    DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
objectMapper.configure(DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES, true);

Java 8 lambdas usage:

https://stackoverflow.com/questions/38662582/update-json-values-with-lambda

JsonNode json = new ObjectMapper().readTree(new ObjectMapper().writeValueAsString(myObject));

ObjectNode rootNode = (ObjectNode) json;
ObjectNode resourceNode = (ObjectNode) rootNode.path("resource");
resourceNode.fields().forEachRemaining(
    entry -> resourceNode.set(
        entry.getKey(),
        func(entry.getValue())
    )
);

Java 8 Streams usage:

StreamSupport.stream(jn.spliterator(), false /* or whatever */);

Oracle select query with inner select query error-ORA-00907: missing right parenthesis

usecase:

// this gets executed

SELECT empid FROM employees WHERE deptid IN (10,20,30,40 );

// this gets executed

SELECT deptid FROM department WHERE description LIKE '%application%' 
  ORDER BY createddate DESC;

but the below query throws error:

SELECT empid  FROM employees WHERE deptid IN (SELECT deptid FROM department WHERE description LIKE '%application%' 
  ORDER BY createddate DESC);

 

Solution

The problem is the ORDER BY inside the subquery in the WHERE clause. Oracle's SQL syntax does not allow ordering the rows of a subquery used in a WHERE clause, since the order would not change the overall result of the query.

This article well explains many of the concepts – http://oraclequirks.blogspot.com/2008/01/ora-00907-missing-right-parenthesis.html

“ORA-00907: missing right parenthesis Clearly when one gets a message like this, the first reaction is probably to verify what parenthesis has been left out, but unfortunately there are no missing parentheses at all in this statement.

To cut it short, the untold syntax quirk is summarized as follows: don’t use ORDER BY inside an IN subquery.

Now, one may object that indeed it doesn’t make sense to use the ORDER BY inside an IN clause, which is true, because Oracle doesn’t care about the row order inside an IN clause:”

I tried the SQL statement with a WHERE clause and ‘=’ instead of ‘IN’ and it still threw the error ‘missing right parenthesis’.

Conclusion 1

“Don’t use ORDER BY in the WHERE clause subquery” or “Subqueries in the WHERE clause are not allowed to use ORDER BY in Oracle”

Conclusion 2

This case study also shows a scenario where we should prefer a JOIN over a subquery in the WHERE clause.

maven install giving compile error complaining java lower version

<!-- If your application uses Java 8 code, the two properties below are required to tell Maven to use Java 1.8; otherwise
you get a compilation error, because older versions of the compiler plugin default to Java 1.5. -->

<properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
</properties>

linkedlist-collections.sort()

        LinkedList

  • LinkedList implements the java.util.List interface, so you can sort the LinkedList by using Collections.sort() method.
  • Since LinkedList class implements the linked list data structure which doesn’t provide random access based upon the index, sorting is quite expensive.
  • In order to access any element, you first need to traverse the list up to that element, which is O(n).

    How does Collections.sort() avoid the O(n) cost of each element access?

  • Collections.sort() method uses an efficient strategy to handle this scenario.
  • It first copies the contents of LinkedList to an array, sorts the array and copies it back.
  • So it’s as efficient as sorting an ArrayList.

    About Collections.sort() ‘s  natural sorting/custom sorting ?


  • By default, Collections.sort() arranges the elements of the linked list in their natural order.
  • It also accepts a Comparator, which can be used to sort elements in a custom order.
  • Java 8 also introduced a sort() method on the java.util.List interface itself, so you no longer need Collections.sort() to sort a LinkedList; you can call list.sort(comparator) directly (pass null for natural ordering).
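
The three variants above can be sketched as follows. The class and method names and the sample data are made up for illustration; only Collections.sort() and List.sort() come from the JDK.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;

class LinkedListSortSketch {

    /** Natural (alphabetical) order via Collections.sort(). */
    static List<String> naturalOrder(List<String> input) {
        LinkedList<String> list = new LinkedList<>(input);
        Collections.sort(list);
        return list;
    }

    /** Custom order via Collections.sort() with a Comparator. */
    static List<String> byLength(List<String> input) {
        LinkedList<String> list = new LinkedList<>(input);
        Collections.sort(list, Comparator.comparingInt(String::length));
        return list;
    }

    /** Same custom order using List.sort(), available since Java 8. */
    static List<String> byLengthJava8(List<String> input) {
        LinkedList<String> list = new LinkedList<>(input);
        list.sort(Comparator.comparingInt(String::length));
        return list;
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("pear", "fig", "banana", "kiwi");
        System.out.println(naturalOrder(fruits));   // [banana, fig, kiwi, pear]
        System.out.println(byLength(fruits));       // [fig, pear, kiwi, banana]
        System.out.println(byLengthJava8(fruits));  // [fig, pear, kiwi, banana]
    }
}
```

The sort is stable, which is why "pear" stays ahead of "kiwi" when both have the same length.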

 

Which is preferred? – Primitive datatypes or Autoboxing(Java primitives vs Objects comparison)?

Primitive datatypes and Autoboxing

  • Avoid creating objects (wrapper classes) unnecessarily.
  • Prefer primitives.
  • Boxed types cost more in both time and memory than primitives, and unboxing a null wrapper throws a NullPointerException.
  • Read this for Java primitives vs Objects comparison and measuring their performance – https://www.baeldung.com/java-primitives-vs-objects
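
A small sketch of the two pitfalls mentioned above. The class and method names are made up for this example, and it makes no timing claims; it only shows where the boxing happens and how the NPE arises.

```java
class BoxingPitfallSketch {

    /** Sums with a primitive accumulator: no wrapper objects are involved. */
    static long sumPrimitive(int n) {
        long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    /** Same loop with a boxed accumulator: every += unboxes, adds, and boxes the result again. */
    static Long sumBoxed(int n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    /** Returns true if unboxing a null wrapper threw a NullPointerException. */
    static boolean unboxingNullThrows() {
        Integer missing = null;
        try {
            int value = missing;   // auto-unboxing calls missing.intValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(1000));     // 499500
        System.out.println(sumBoxed(1000));         // 499500
        System.out.println(unboxingNullThrows());   // true
    }
}
```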

Order of execution of Static , Initializer and Constructor in a class with a parent class

What is the order of execution of Static , Initializer and Constructor in a class with a parent class eg: Class A extends Class B

Always remember the rule – “SIC” – Static, Initializer and then Constructor.

  1.  Parent class is loaded first
  2.  Static block of Parent
  3. Child class is loaded 
  4. Static variables/block of Child
  5. Initializer of Parent
  6. Constructor of Parent
  7.  Initializer of Child
  8. Constructor of Child
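
The order above can be checked with a small experiment. The class names and the shared LOG list are made up for this sketch; each static block, initializer block and constructor simply records that it ran.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderSketch {

    /** Shared log so the order can be inspected after construction. */
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("parent static"); }
        { LOG.add("parent initializer"); }
        Parent() { LOG.add("parent constructor"); }
    }

    static class Child extends Parent {
        static { LOG.add("child static"); }
        { LOG.add("child initializer"); }
        Child() { LOG.add("child constructor"); }
    }

    public static void main(String[] args) {
        new Child();
        // Prints: parent static, child static, parent initializer,
        // parent constructor, child initializer, child constructor
        LOG.forEach(System.out::println);
    }
}
```

Creating a second Child would add only the four instance-level entries, because static blocks run once per class.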

 

print something on console if there is no PSVM method

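If your class doesn’t have a PSVM (main method), is there a way to print something on console?

Yes, use static blocks. Note that the class still has to be loaded and initialized for the block to run; since Java 7 the java launcher checks for a main method before initializing the class, so running a main-less class directly prints an error instead.

  • Static blocks are mostly used for initializing static variables.
  • A static block gets executed when the class is loaded into memory.
  • A class can have multiple static blocks, which will be executed in the order in which they are written.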
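
A minimal sketch of these points, with made-up names. Greeter has no main method; its two static blocks run, in source order, the first time the class is used. The outer main exists only to trigger that first use.

```java
class StaticBlockSketch {

    /** Collects what the static blocks report, so the order can be checked. */
    static final StringBuilder OUT = new StringBuilder();

    /** No main method here; the static blocks run when the class is initialized. */
    static class Greeter {
        static int counter;

        static {
            counter = 10;            // replaces the default value 0
            OUT.append("first;");
        }

        static {
            counter += 5;            // runs after the block above
            OUT.append("second;");
        }
    }

    public static void main(String[] args) {
        int value = Greeter.counter;   // first use initializes Greeter and runs both blocks
        System.out.println(OUT + " counter=" + value);   // prints first;second; counter=15
    }
}
```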

Copyright © 2020 Codinko- Java Coding Tutorials
