Apache Oozie. The Workflow Scheduler for Hadoop

- Authors:
- Mohammad Kamrul Islam, Aravind Srinivasan
- Pages:
- 272
- Available formats:
- ePub, Mobi
Book description: Apache Oozie. The Workflow Scheduler for Hadoop
Get a solid grounding in Apache Oozie, the workflow scheduler system for managing Hadoop jobs. With this hands-on guide, two experienced Hadoop practitioners walk you through the intricacies of this powerful and flexible platform, with numerous examples and real-world use cases.
Once you set up your Oozie server, you’ll dive into techniques for writing and coordinating workflows, and learn how to write complex data pipelines. Advanced topics show you how to handle shared libraries in Oozie, as well as how to implement and manage Oozie’s security capabilities.
- Install and configure an Oozie server, and get an overview of basic concepts
- Journey through the world of writing and configuring workflows
- Learn how the Oozie coordinator schedules and executes workflows based on triggers
- Understand how Oozie manages data dependencies
- Use Oozie bundles to package several coordinator apps into a data pipeline
- Learn about security features and shared library management
- Implement custom extensions and write your own EL functions and actions
- Debug workflows and manage Oozie’s operational details
Book details
- Ebook ISBN:
- 978-14-493-6975-0, 9781449369750
- Ebook release date:
- 2015-05-12
The ebook release date is often the day the title went on sale and may not match the publication date of the print edition.
- Publication language:
- English
- ePub file size:
- 6.2 MB
- Mobi file size:
- 6.2 MB
- Categories:
- Web servers » Apache
Table of contents
- Foreword
- Preface
- Contents of This Book
- Conventions Used in This Book
- Using Code Examples
- Safari Books Online
- How to Contact Us
- Acknowledgments
- 1. Introduction to Oozie
- Big Data Processing
- A Recurrent Problem
- A Common Solution: Oozie
- Oozie’s role in the Hadoop Ecosystem
- What exactly is Oozie?
- The name Oozie
- A Simple Oozie Job
- Oozie Releases
- Timeline and status of the releases
- Compatibility
- Some Oozie Usage Numbers
- 2. Oozie Concepts
- Oozie Applications
- Oozie Workflows
- Workflow use case
- Oozie Coordinators
- Coordinator use case
- Oozie Bundles
- Bundle use case
- Parameters, Variables, and Functions
- Application Deployment Model
- Oozie Architecture
- 3. Setting Up Oozie
- Oozie Deployment
- Basic Installations
- Requirements
- Build Oozie
- Install Oozie Server
- Hadoop Cluster
- Hadoop installation
- Configuring Hadoop for Oozie
- Start and Verify the Oozie Server
- Advanced Oozie Installations
- Configuring Kerberos Security
- DB Setup
- MySQL configuration
- Oracle configuration
- Shared Library Installation
- Sharelib since version 4.1.0
- Oozie Client Installations
- 4. Oozie Workflow Actions
- Workflow
- Actions
- Action Execution Model
- Action Definition
- Action Types
- MapReduce Action
- Streaming
- Pipes
- MapReduce example
- Streaming example
- Java Action
- Java example
- Pig Action
- Pig example
- FS Action
- Filesystem example
- Sub-Workflow Action
- Hive Action
- Hive example
- DistCp Action
- DistCp example
- Email Action
- Shell Action
- Shell example
- SSH Action
- Sqoop Action
- Sqoop example
- Synchronous Versus Asynchronous Actions
- 5. Workflow Applications
- Outline of a Basic Workflow
- Control Nodes
- <start> and <end>
- <fork> and <join>
- <decision>
- <kill>
- <OK> and <ERROR>
- Job Configuration
- Global Configuration
- Job XML
- Inline Configuration
- Launcher Configuration
- Parameterization
- EL Variables
- EL constants and system-defined variables
- Hadoop counters
- EL Functions
- String timestamp()
- String wf:id()
- String wf:errorCode(String node)
- boolean fs:fileSize(String path)
- EL Expressions
- The job.properties File
- Command-Line Option
- The config-default.xml File
- The <parameters> Section
- Configuration and Parameterization Examples
- Lifecycle of a Workflow
- Action States
- 6. Oozie Coordinator
- Coordinator Concept
- Triggering Mechanism
- Time Trigger
- Data Availability Trigger
- Coordinator Application and Job
- Coordinator Action
- Our First Coordinator Job
- Coordinator Submission
- Oozie Web Interface for Coordinator Jobs
- Coordinator Job Lifecycle
- Coordinator Action Lifecycle
- Parameterization of the Coordinator
- EL Functions for Frequency
- Day-Based Frequency
- Month-Based Frequency
- Execution Controls
- An Improved Coordinator
- 7. Data Trigger Coordinator
- Expressing Data Dependency
- Dataset
- Defining a dataset
- Timelines: coordinator versus dataset
- input-events
- output-events
- Example: Rollup
- Parameterization of Dataset Instances
- current(n)
- latest(n)
- Comparison of current() and latest()
- Parameter Passing to Workflow
- dataIn(eventName)
- dataOut(eventName)
- nominalTime()
- actualTime()
- dateOffset(baseTimeStamp, skipInstance, timeUnit)
- formatTime(timeStamp, formatString)
- A Complete Coordinator Application
- 8. Oozie Bundles
- Bundle Basics
- Bundle Definition
- Why Do We Need Bundles?
- Bundle Specification
- Execution Controls
- Bundle State Transitions
- 9. Advanced Topics
- Managing Libraries in Oozie
- Origin of JARs in Oozie
- Design Challenges
- Managing Action JARs
- How to get the JARs?
- Installing sharelib
- Overriding/upgrading existing JARs
- Supporting multiple versions
- Supporting the User’s JAR
- JAR Precedence in classpath
- Oozie Security
- Oozie Security Overview
- Oozie to Hadoop
- Configuring Hadoop services
- Setting up Keytab and Principal
- Configuring the Oozie server
- Oozie Client to Server
- Oozie Server Security
- Configuring the Oozie Server
- Oozie client
- Proxy user in Oozie
- Supporting Custom Credentials
- Supporting New API in MapReduce Action
- Supporting Uber JAR
- Cron Scheduling
- A Simple Cron-Based Coordinator
- Oozie Cron Specification
- Allowed values
- Special characters
- Nonstandard special characters
- Emulate Asynchronous Data Processing
- HCatalog-Based Data Dependency
- 10. Developer Topics
- Developing Custom EL Functions
- Requirements for a New EL Function
- Implementing a New EL Function
- Writing a new EL function
- Deploy the new EL function
- Using the new function
- Supporting Custom Action Types
- Creating a Custom Synchronous Action
- Writing an ActionExecutor
- Writing the XML schema
- Deploying the new action type
- Using the new action type
- Overriding an Asynchronous Action Type
- Implementing the New ActionMain Class
- Testing the New Main Class
- Creating a New Asynchronous Action
- Writing an Asynchronous Action Executor
- Writing the ActionMain Class
- Writing Action’s Schema
- Deploying the New Action Type
- Using the New Action Type
- 11. Oozie Operations
- Oozie CLI Tool
- CLI Subcommands
- Useful CLI Commands
- The validate subcommand
- The job subcommand
- The jobs subcommand
- More subcommands
- Oozie REST API
- Oozie Java Client
- The oozie-site.xml File
- The Oozie Purge Service
- Job Monitoring
- JMS-Based Monitoring
- Installation and configuration
- Consuming JMS messages
- Oozie Instrumentation and Metrics
- Reprocessing
- Workflow Reprocessing
- Coordinator Reprocessing
- Bundle Reprocessing
- Server Tuning
- JVM Tuning
- Service Settings
- The CallableQueueService
- The RecoveryService
- Oozie High Availability
- Debugging in Oozie
- Oozie Logs
- Developing and Testing Oozie Applications
- Application Deployment Tips
- Common Errors and Debugging
- MiniOozie and LocalOozie
- The Competition
- Index