Metalop.com: Just Another Blog about Technology

Securing Hadoop Big Data Landscape with Apache Knox Gateway and Keycloak: Part 4 (Configuring Knox to Authenticate with Keycloak)

July 10, 2019 2 comments Article Technology Shiva

Prerequisites

Both Keycloak and Knox are Java-based applications, so the only prerequisite is to have Java installed and the JAVA_HOME environment variable set. I have tested the setup with Java 8.
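
A quick way to confirm the prerequisite before going further is a small shell check. This is only a sketch: the function name check_java is made up for illustration, and it assumes a POSIX-style shell.

```shell
# Hypothetical helper: fail early if JAVA_HOME is missing or does not
# point at a directory that contains bin/java.
check_java() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "No java executable under $JAVA_HOME/bin" >&2
    return 1
  fi
  # Prints the installed Java version (the post was tested with Java 8).
  "$JAVA_HOME/bin/java" -version
}

# Usage: check_java && echo "Java looks fine"
```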

Setting up Keycloak

Getting started with Keycloak is as easy as downloading the latest version of Keycloak Server from the download section of the website, navigating to the “bin” directory, and running standalone.bat or standalone.sh.

You can then access Keycloak at localhost:8080. The first time, you will be required to create an initial admin user and password. Once that is done, that is pretty much it. You can explore Keycloak after logging in to the admin console.
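
Since Knox will later read Keycloak's SAML metadata over HTTP (the saml.identityProviderMetadataPath URL used in the Knox configuration in this post), it can help to confirm now that Keycloak is serving it. A minimal sketch, assuming curl is installed; the function name idp_metadata_ok is hypothetical.

```shell
# Hypothetical helper: succeeds if the URL returns SAML identity provider
# metadata, recognised here by the presence of an IDPSSODescriptor element.
idp_metadata_ok() {
  # -f: treat HTTP errors as failures, -sS: quiet but still print errors
  curl -fsS "$1" | grep -q 'IDPSSODescriptor'
}

# Usage (URL taken from the Knox configuration in this post):
#   idp_metadata_ok http://localhost:8080/auth/realms/master/protocol/saml/descriptor \
#     && echo "Keycloak is serving IdP metadata"
```

If this fails, Knox has nothing to load as identity provider metadata, so it is worth fixing before touching the Knox side.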

Setting up Apache Knox

Download the latest Gateway Server Binary from the Knox release website. For this post the configuration is based on version 1.2.0 of the gateway.

Start LDAP embedded in Knox

Knox comes with an LDAP server for demonstration purposes. Start it using the following command:

cd {GATEWAY_HOME} 
bin/ldap.sh start

Create the Master Secret

Run the “knoxcli.sh” command to persist the master secret that is used to protect the key and credential stores for the gateway instance.

cd {GATEWAY_HOME}
bin/knoxcli.sh create-master

Start Knox

Before you start the Knox instance, navigate to “conf” and change the property “gateway.port” in gateway-site.xml to 18443 (or another free port), as we already have Keycloak running on 8080 and 8443. Then you can start a Knox instance with the following command:
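
To double-check the edit, the value can be read back from the file. The helper below is a sketch only: site_property is a made-up name, and it assumes the usual Hadoop-style layout in which each property has a <name> element followed by a <value> element.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop-style site file such as conf/gateway-site.xml, e.g.
#   <property>
#       <name>gateway.port</name>
#       <value>18443</value>
#   </property>
site_property() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character <value> and 8-character </value> tags.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Usage, from {GATEWAY_HOME}:
#   site_property conf/gateway-site.xml gateway.port
```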

cd {GATEWAY_HOME}
bin/gateway.sh start

Once you are done, stop the Knox instance with:

cd {GATEWAY_HOME}
bin/gateway.sh stop

Now that we are up and running, we will configure SAML authentication in Knox. It is a three-part process.

1) Configure Keycloak as Identity provider in Knox

Copy “conf/topologies/knoxsso.xml” to “conf/topologies/keycloak.xml”. Now edit it: delete the “ShiroProvider” provider and add the Pac4j provider in its place.

The full configuration will look like this:


   
     
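<topology>
   <gateway>
     <provider>
         <role>federation</role>
         <name>pac4j</name>
         <enabled>true</enabled>
         <param>
           <name>pac4j.callbackUrl</name>
           <value>https://localhost:18443/gateway/keycloak/api/v1/websso</value>
         </param>

         <param>
           <name>clientName</name>
           <value>SAML2Client</value>
         </param>

         <param>
           <name>saml.identityProviderMetadataPath</name>
           <value>http://localhost:8080/auth/realms/master/protocol/saml/descriptor</value>
         </param>

         <param>
           <name>saml.serviceProviderMetadataPath</name>
           <value>./sp-metadata.xml</value>
         </param>

         <param>
           <name>saml.serviceProviderEntityId</name>
           <value>https://localhost:18443/gateway/keycloak/api/v1/websso?pac4jCallback=true&amp;client_name=SAML2Client</value>
         </param>
     </provider>
   </gateway>

   <service>
       <role>KNOXSSO</role>
       <param>
         <name>knoxsso.cookie.secure.only</name>
         <value>true</value>
       </param>
       <param>
         <name>knoxsso.token.ttl</name>
         <value>100000</value>
       </param>
       <param>
         <name>knoxsso.redirect.whitelist.regex</name>
         <value>^https?:\/\/(www\.local\.com|localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$</value>
       </param>
   </service>
</topology>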

Note: There is a “saml.serviceProviderMetadataPath” parameter in the config. This is the path of the service provider metadata file. Knox will generate this file the first time you access the service.

Navigate to “https://localhost:18443/gateway/keycloak/api/v1/websso” and the file will be generated in your “bin” folder.
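
One parameter in this topology that is easy to overlook is knoxsso.redirect.whitelist.regex. Knox checks the URL it is asked to send the browser back to after login against this pattern, so a gateway reached under any other hostname would need the pattern extended. The sketch below tests candidate URLs against the same pattern from the shell. The names WHITELIST and redirect_allowed are made up for illustration, knox.example.com is a placeholder hostname, and the pattern is written without the \/ escapes, which grep does not need.

```shell
# Same whitelist as in keycloak.xml, as an extended regular expression.
WHITELIST='^https?://(www\.local\.com|localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$'

# Hypothetical helper: succeeds if the given URL matches the whitelist.
redirect_allowed() {
  printf '%s\n' "$1" | grep -Eq "$WHITELIST"
}

# Usage:
#   redirect_allowed 'https://localhost:18443/gateway/sandbox-sso/webhdfs/v1/?op=LISTSTATUS' \
#     && echo allowed
#   redirect_allowed 'https://knox.example.com:18443/gateway/sandbox-sso/' \
#     || echo "not whitelisted"
```

Note that the pattern requires an explicit port after the hostname, so a URL such as https://localhost/gateway would not match either.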

2) Configure Knox as a service provider in Keycloak

In the second part, we need to configure Keycloak to authenticate the requests coming from Knox. To achieve this, log in to the Keycloak admin panel, navigate to Clients, and click Create.

On the next screen, import the sp-metadata.xml file and click Save.
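
Before importing, it can be worth looking at the entityID inside sp-metadata.xml, since it should correspond to the saml.serviceProviderEntityId configured in Knox and, as far as I can tell, Keycloak uses it as the ID of the imported client. A small sketch; sp_entity_id is a hypothetical helper name, and it assumes the attribute is written with double quotes.

```shell
# Hypothetical helper: print the first entityID attribute found in a
# SAML metadata file. The value is printed as stored in the XML, so an
# ampersand in the URL shows up as &amp;.
sp_entity_id() {
  sed -n 's/.*entityID="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Usage, from the folder where Knox generated the file:
#   sp_entity_id sp-metadata.xml
```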

3) Secure a topology using the “SSOCookieProvider” provider

In this step we will secure our topology so that it authenticates with our SSO service. To achieve this, create a topology that is secured using a cookie issued by Knox SSO. Copy “conf/topologies/sandbox.xml” to “conf/topologies/sandbox-sso.xml”. The final contents of the file will look like this:



    

        
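<topology>

    <gateway>

        <provider>
            <role>federation</role>
            <name>SSOCookieProvider</name>
            <enabled>true</enabled>
            <param>
                <name>sso.authentication.provider.url</name>
                <value>https://localhost:18443/gateway/keycloak/api/v1/websso</value>
            </param>
        </provider>

    </gateway>

    <service>
        <role>NAMENODE</role>
        <url>hdfs://localhost:8020</url>
    </service>

    <service>
        <role>JOBTRACKER</role>
        <url>rpc://localhost:8050</url>
    </service>

    <service>
        <role>WEBHDFS</role>
        <url>http://localhost:50070/webhdfs</url>
    </service>

    <service>
        <role>WEBHCAT</role>
        <url>http://localhost:50111/templeton</url>
    </service>

    <service>
        <role>OOZIE</role>
        <url>http://localhost:11000/oozie</url>
        <param>
            <name>replayBufferSize</name>
            <value>8</value>
        </param>
    </service>

    <service>
        <role>WEBHBASE</role>
        <url>http://localhost:60080</url>
        <param>
            <name>replayBufferSize</name>
            <value>8</value>
        </param>
    </service>

    <service>
        <role>HIVE</role>
        <url>http://localhost:10001/cliservice</url>
        <param>
            <name>replayBufferSize</name>
            <value>8</value>
        </param>
    </service>

    <service>
        <role>RESOURCEMANAGER</role>
        <url>http://localhost:8088/ws</url>
    </service>

    <service>
        <role>DRUID-COORDINATOR-UI</role>
        <url>http://localhost:8081</url>
    </service>

    <service>
        <role>DRUID-COORDINATOR</role>
        <url>http://localhost:8081</url>
    </service>

    <service>
        <role>DRUID-BROKER</role>
        <url>http://localhost:8082</url>
    </service>

    <service>
        <role>DRUID-ROUTER</role>
        <url>http://localhost:8082</url>
    </service>

    <service>
        <role>DRUID-OVERLORD</role>
        <url>http://localhost:8090</url>
    </service>

    <service>
        <role>DRUID-OVERLORD-UI</role>
        <url>http://localhost:8090</url>
    </service>

</topology>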

Add each individual service that you want to protect as its own <service> element.

To check the configuration, navigate to https://localhost:18443/gateway/sandbox-sso/webhdfs/v1/?op=LISTSTATUS
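
The same check can be made from a terminal. Without an SSO cookie, the SSOCookieProvider should answer with a redirect to the sso.authentication.provider.url configured in the topology, which is a quick sign that the protection is active. This is a sketch assuming curl is installed; redirect_target is a hypothetical helper name, and -k is used only because a demo gateway typically runs with a self-signed certificate.

```shell
# Hypothetical helper: print the URL that a request is redirected to,
# or an empty line if the server does not redirect.
redirect_target() {
  # -k: skip TLS verification (demo setup only)
  # -o /dev/null: discard the response body
  # -w '%{redirect_url}': print the redirect target reported by curl
  curl -k -s -o /dev/null -w '%{redirect_url}\n' "$1"
}

# Usage:
#   redirect_target 'https://localhost:18443/gateway/sandbox-sso/webhdfs/v1/?op=LISTSTATUS'
```

If the printed URL starts with https://localhost:18443/gateway/keycloak/api/v1/websso, the topology is handing unauthenticated requests over to the SSO topology as intended.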

Tags: big data, hadoop, Keycloak, security

2 comments

  • Digvijay Sawant June 24, 2020 at 11:54 pm

    Hello Shiva Sir,
    Your blog is very useful for securing a Hadoop cluster using Keycloak.
    I followed your steps, but I am facing an issue:
    1] After authenticating with Keycloak, the URL does not redirect to my Hadoop cluster and gives ERROR 500 - Problem accessing /gateway/keycloak/api/v1/websso Request Failed.
    Is this an issue with redirecting the URL via Knox to the Hadoop cluster?
    How do I set the redirect URL for other topologies (what should be changed in the .xml file) for the Hadoop cluster?
    Thanks for your blog.

  • Shiva July 19, 2020 at 2:07 am

    Hi Digvijay,

    This could be because Keycloak is not running. Please check whether you are able to access /gateway/keycloak/api/v1/websso from the browser. Checking the errors in the logs will also help.
