Apache Hive-Orien IT - PowerPoint PPT Presentation

About This Presentation
Title:

Apache Hive-Orien IT

Description:

Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data summarization, query, and analysis. – PowerPoint PPT presentation

Number of Views:147

less

Transcript and Presenter's Notes

Title: Apache Hive-Orien IT


1

http//www.orienit.com/
2
Database Vs Data Warehouse
  • Historical Data is the back bone of any
    business for mission critical business decisions.
    Data is stored in some form of tables in the
    database.So why the Business Intelligence systems
    are using Data Warehouse rather than Database to
    pull historical data?

What is the difference between Database and DataWa
rehouse while both of them have some tables with
Data, Index and key etc.?  Here are Differences.
http//www.orienit.com/
3
DATA BASE
  • Used for Online Transactional Processing (OLTP).
    This records the data from the user for history.
  • The tables and joins are complex since they are
    normalized. This is done to reduce redundant data
    and to save storage space.
  • Entity Relational modeling techniques are used
    for database design.
  • Optimized for write operation.
  • Performance is low for analysis queries. 

http//www.orienit.com/
4
DATA WAREHOUSE
  • Used for Online Analytical Processing (OLAP).
    This reads the historical data for the
    Users  for business decisions. 
  •  The Tables and joins are simple since they are
    de-normalized. This is done to reduce
    the response time for analytical queries. 
  • Data Modeling techniques are used for the Data
    Warehouse design.
  • Optimized for read operations. 
  • High performance for analytical queries.

http//www.orienit.com/
5
General Data Flow  (Ex Online Insurance
Registration)
  • Customer enters the details in the Online
    Registration form.
  • The details are saved into the Database when the
    customer presses the Submit button in the form.
  • Business Intelligence Team of the Insurance
    Company uses an ETL tool to pull the data from
    the Database tables to the Data Warehouse tables.
  • Business Management uses Business Reporting
    Tools to pull Data from Data Warehouse tables for
    generating business reports.

http//www.orienit.com/
6
STRING FUNCTIONS IN HIVE
The string functions in Hive are listed below
1.ASCII( string str )
  • The ASCII function converts the first character
    of the string into its numeric ascii value.

Example1 ASCII('hadoop') returns 104
Example2 ASCII('A') returns 65
2.CONCAT( string str1, string str2... )
  • The CONCAT function concatenates all the stings.

Example CONCAT('hadoop','-','hive') returns
'hadoop-hive'
http//www.orienit.com/
7
3. CONCAT_WS( string delimiter, string str1,
string str2... )
  • The CONCAT_WS function is similar to the CONCAT
    function. Here you can also provide the
    delimiter, which can be used in between the
    strings to concat.

Example CONCAT_WS('-','hadoop','hive')
returns 'hadoop-hive'
4. FIND_IN_SET( string search string, string
source_string_list )
  • The FIND_IN_SET function searches for the
    search string in the source_string_list and
    returns the position of the first occurrence in
    the source string list. Here the source string
    list should be comma delimited one. It returns 0
    if the first argument contains comma.

Example FIND_IN_SET('ha','hao,mn,hc,ha,hef')
returns 4
http//www.orienit.com/
8
5. LENGTH( string str )
  • The LENGTH function returns the number of
    characters in a string.

Example LENGTH('hive') returns 4
6. LOWER( string str ),  LCASE( string str )
  • The LOWER or LCASE function converts the string
    into lower case letters

Example LOWER('HiVe') returns 'hive'
7. LPAD( string str, int len, string pad )
  • The LPAD function returns the string with a
    length of len characters left-padded with pad.

Example LPAD('hive',6,'v') returns 'vvhive'
http//www.orienit.com/
9
8. LTRIM( string str )
  • The LTRIM function removes all the trailing
    spaces from the string.

Example LTRIM('   hive') returns 'hive'
9. REPEAT( string str, int n ) 
  • The REPEAT function repeats the specified string
    n times.

Example REPEAT('hive',2) returns 'hivehive'
10.RPAD( string str, int len, string pad )
  • The RPAD function returns the string with a
    length of len characters right-padded with pad.

Example RPAD('hive',6,'v') returns 'hivevv'
http//www.orienit.com/
10
11. REVERSE( string str )
  • The REVERSE function gives the reversed
    string.

Example REVERSE('hive') returns 'evih'
12. RTRIM( string str )
  • The RTRIM function removes all the leading
    spaces from the string.

Example LTRIM('hive   ') returns 'hive'
13. SPACE( int number_of_spaces )
  • The SPACE function returns the specified number
    of spaces.

Example SPACE(4) returns '    '
14. SPLIT( string str, string pat )  
  • The SPLIT function splits the string around the
    pattern pat and returns an array of strings. You
    can specify regular expressions as patterns.

Example SPLIT('hivehadoop','') returns
"hive","hadoop"
http//www.orienit.com/
11
15. SUBSTR( string source_str, int
start_position ,int length  ),  SUBSTRING(
string source_str, int start_position ,int
length  ) 
  • The SUBSTR or SUBSTRING function returns a
    part of the source string from the start position
    with the specified length of characters. If the
    length is not given, then it returns from the
    start position to the end of the string.

Example1 SUBSTR('hadoop',4) returns 'oop'
Example2 SUBSTR('hadoop',4,2) returns 'oo'
16.TRIM( string str )
  • The TRIM function removes both the trailing and
    leading spaces from the string.

Example LTRIM('   hive   ') returns 'hive'
http//www.orienit.com/
12
17.UPPER( string str ), UCASE( string str )
  • The UPPER or LCASE function converts the string
    into upper case letters.

Example UPPER('HiVe') returns 'HIVE'
For More Information
http//www.kalyanhadooptraining.com/
http//www.orienit.com/
13
Address Flat no 204, 2nd floor, Annapurna
Block, Aditya Enclave, Ameerpet,
Hyderabad-16. Contact us 040 6514 2345, 91
970 320 2345
Email ID info_at_OrienIT.com
http//www.orienit.com/
Write a Comment
User Comments (0)
About PowerShow.com