Metalop.com: Just Another Blog about Technology

Securing Hadoop Big Data Landscape with Apache Knox Gateway and Keycloak: Part 4 (Configuring Knox to Authenticate with Keycloak)

July 10, 2019, by Shiva

Prerequisites

As both Keycloak and Knox are Java-based applications, the only prerequisite is to have Java installed and the JAVA_HOME environment variable set. I have tested this setup with Java 8.

Setting up Keycloak

Getting started with Keycloak is as easy as downloading the latest version of Keycloak Server from the download section of the website, navigating to the “bin” directory, and running the standalone.bat or standalone.sh file.

You can then access Keycloak at localhost:8080. The first time, you will be required to create an initial admin user and password. Once that is done, that's pretty much it. You can play around with it after logging in to the admin console.

Setting up Apache Knox

Download the latest Gateway Server Binary from the Knox release website. For this post the configuration is based on version 1.2.0 of the gateway.

Start LDAP embedded in Knox

Knox comes with an LDAP server for demonstration purposes. Start it using the following command:

cd {GATEWAY_HOME}
bin/ldap.sh start

Create the Master Secret

Run the “knoxcli.sh” command to persist the master secret, which is used to protect the key and credential stores for the gateway instance.

cd {GATEWAY_HOME}
bin/knoxcli.sh create-master

Start Knox

Before you start the Knox instance, navigate to “conf” and change the property “gateway.port” to 18443 (or another free port) in gateway-site.xml, as we already have Keycloak running on 8080 and 8443. Then you can start a Knox instance with the following command:

cd {GATEWAY_HOME}
bin/gateway.sh start
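
For reference, the port change described above might look like the following fragment in conf/gateway-site.xml. This is only a sketch: it assumes the file uses the usual Hadoop-style property entries, and 18443 is simply the port chosen in this post.

```xml
<!-- conf/gateway-site.xml (fragment): only the gateway.port property is shown -->
<configuration>
    <property>
        <name>gateway.port</name>
        <!-- Knox defaults to 8443, which Keycloak already uses for HTTPS here -->
        <value>18443</value>
    </property>
</configuration>
```

The other properties already present in the file stay as they are; Knox needs a restart to pick up the new port.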

Once you are done, stop the Knox instance with:

cd {GATEWAY_HOME}
bin/gateway.sh stop

Now that we are up and running, we will configure SAML authentication in Knox. It's a three-part process.

1) Configure Keycloak as Identity provider in Knox

Copy “conf/topologies/knoxsso.xml” to “conf/topologies/keycloak.xml”. Now edit it: delete the “ShiroProvider” provider and add the Pac4j provider instead.

The full configuration will look like this:

<topology>
   <gateway>
     <provider>
         <role>federation</role>
         <name>pac4j</name>
         <enabled>true</enabled>
         <param>
           <name>pac4j.callbackUrl</name>
           <value>https://localhost:18443/gateway/keycloak/api/v1/websso</value>
         </param>
 
         <param>
           <name>clientName</name>
           <value>SAML2Client</value>
         </param>
 
         <param>
           <name>saml.identityProviderMetadataPath</name>
           <value>http://localhost:8080/auth/realms/master/protocol/saml/descriptor</value>
         </param>
 
         <param>
           <name>saml.serviceProviderMetadataPath</name>
           <value>./sp-metadata.xml</value>
         </param>
 
         <param>
           <name>saml.serviceProviderEntityId</name>
           <value>https://localhost:18443/gateway/keycloak/api/v1/websso?pac4jCallback=true&amp;client_name=SAML2Client</value>
         </param>
     </provider>
    
   </gateway>
 
   <service>
       <role>KNOXSSO</role>
       <param>
         <name>knoxsso.cookie.secure.only</name>
         <value>true</value>
      </param>
      <param>
        <name>knoxsso.token.ttl</name>
        <value>100000</value>
      </param>
      <param>
         <name>knoxsso.redirect.whitelist.regex</name>
         <value>^https?:\/\/(www\.local\.com|localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$</value>
      </param>
   </service>
</topology>
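
The knoxsso.redirect.whitelist.regex parameter above only lets KnoxSSO redirect back to local addresses. If the protected topology is reached under another hostname, that host would presumably need to be added to the expression. A sketch of just that parameter, where knox.example.com is a made-up placeholder hostname and not part of this setup:

```xml
<!-- Fragment of the KNOXSSO service in conf/topologies/keycloak.xml -->
<param>
    <name>knoxsso.redirect.whitelist.regex</name>
    <!-- knox.example.com is a hypothetical host added to the original expression -->
    <value>^https?:\/\/(knox\.example\.com|www\.local\.com|localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$</value>
</param>
```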

Note: There is a “saml.serviceProviderMetadataPath” parameter in the config. This is the path of the service provider metadata file. Knox will generate this file the first time you access the service.

Navigate to “https://localhost:18443/gateway/keycloak/api/v1/websso” and the file will be generated in the gateway's “bin” folder.
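
If you would rather not depend on the directory Knox was started from, the metadata path can be given as an absolute path instead. A minimal sketch of just that parameter, assuming /tmp/sp-metadata.xml is a location the Knox process can write to (the path is my example, not something the setup requires):

```xml
<!-- Fragment of the pac4j provider in conf/topologies/keycloak.xml -->
<param>
    <name>saml.serviceProviderMetadataPath</name>
    <!-- Assumed example path; any location writable by the Knox process should do -->
    <value>/tmp/sp-metadata.xml</value>
</param>
```

With this variant, the file to import into Keycloak in the next part would be /tmp/sp-metadata.xml.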

2) Configure Knox as a service provider in Keycloak

Now, in the second part, we need to configure Keycloak to authenticate the requests coming from Knox. To achieve this, log in to the Keycloak admin console, navigate to Clients, and click Create.

On the next screen, import the sp-metadata.xml file and click Save.

3) Secure a topology using the “SSOCookieProvider” provider

In this step we will secure a topology so that it authenticates against our SSO service. To achieve this, create a topology that is secured using a cookie issued by Knox SSO. Copy “conf/topologies/sandbox.xml” to “conf/topologies/sandbox-sso.xml”. The final contents of the file will look like this:

<?xml version="1.0" encoding="utf-8"?>
<topology>
 
    <gateway>
 
        <provider>
            <role>federation</role>
            <name>SSOCookieProvider</name>
            <enabled>true</enabled>
            <param>
                <name>sso.authentication.provider.url</name>
                <value>https://localhost:18443/gateway/keycloak/api/v1/websso</value>
            </param>
        </provider>
 
    </gateway>
 
    <service>
        <role>NAMENODE</role>
        <url>hdfs://localhost:8020</url>
    </service>
 
    <service>
        <role>JOBTRACKER</role>
        <url>rpc://localhost:8050</url>
    </service>
 
    <service>
        <role>WEBHDFS</role>
        <url>http://localhost:50070/webhdfs</url>
    </service>
 
    <service>
        <role>WEBHCAT</role>
        <url>http://localhost:50111/templeton</url>
    </service>
 
    <service>
        <role>OOZIE</role>
        <url>http://localhost:11000/oozie</url>
        <param>
            <name>replayBufferSize</name>
            <value>8</value>
        </param>
    </service>
 
    <service>
        <role>WEBHBASE</role>
        <url>http://localhost:60080</url>
        <param>
            <name>replayBufferSize</name>
            <value>8</value>
        </param>
    </service>
 
    <service>
        <role>HIVE</role>
        <url>http://localhost:10001/cliservice</url>
        <param>
            <name>replayBufferSize</name>
            <value>8</value>
        </param>
    </service>
 
    <service>
        <role>RESOURCEMANAGER</role>
        <url>http://localhost:8088/ws</url>
    </service>
 
    <service>
        <role>DRUID-COORDINATOR-UI</role>
        <url>http://localhost:8081</url>
    </service>
 
    <service>
        <role>DRUID-COORDINATOR</role>
        <url>http://localhost:8081</url>
    </service>
 
    <service>
        <role>DRUID-BROKER</role>
        <url>http://localhost:8082</url>
    </service>
 
    <service>
        <role>DRUID-ROUTER</role>
        <url>http://localhost:8082</url>
    </service>
    
    <service>
        <role>DRUID-OVERLORD</role>
        <url>http://localhost:8090</url>
    </service>
 
    <service>
        <role>DRUID-OVERLORD-UI</role>
        <url>http://localhost:8090</url>
    </service>
 
</topology>

Add individual services that you want to be protected in the <service> section.
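
As an illustration, if WebHDFS were the only service you wanted behind SSO, the topology could shrink to the provider plus a single service entry. This is a trimmed sketch of the file above, reusing the same URLs; nothing in it is new apart from the comments.

```xml
<?xml version="1.0" encoding="utf-8"?>
<topology>
    <gateway>
        <!-- Same SSOCookieProvider as in the full file above -->
        <provider>
            <role>federation</role>
            <name>SSOCookieProvider</name>
            <enabled>true</enabled>
            <param>
                <name>sso.authentication.provider.url</name>
                <value>https://localhost:18443/gateway/keycloak/api/v1/websso</value>
            </param>
        </provider>
    </gateway>

    <!-- Only the service(s) you want to expose through this topology -->
    <service>
        <role>WEBHDFS</role>
        <url>http://localhost:50070/webhdfs</url>
    </service>
</topology>
```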

To check the configuration, you can navigate to https://localhost:18443/gateway/sandbox-sso/webhdfs/v1/?op=LISTSTATUS

Tags: big data, hadoop, Keycloak, security

2 comments

  • Digvijay Sawant, June 24, 2020 at 11:54 pm

    Hello Shiva Sir,
    Your blog is very useful for securing a Hadoop cluster using Keycloak.
    I followed your steps, but I am facing an issue:
    1] After authenticating with Keycloak, the URL does not redirect to my Hadoop cluster and gives: ERROR 500 - Problem accessing /gateway/keycloak/api/v1/websso Request Failed.
    Is this an issue with redirecting the URL via Knox to the Hadoop cluster?
    How do I give the redirect URL for other topologies of the Hadoop cluster (what should be changed in the .xml file)?
    Thanks for your blog.

  • Shiva, July 19, 2020 at 2:07 am

    Hi Digvijay,

    This could be because Keycloak is not running. Please check whether you can access /gateway/keycloak/api/v1/websso from the browser. Checking the errors in the logs will also help.
