Securing Hadoop Big Data Landscape with Apache Knox Gateway and Keycloak: Part 4 (Configuring Knox to Authenticate with Keycloak)
Prerequisites
As both Keycloak and Knox are Java-based applications, the only prerequisite is to have Java installed and the JAVA_HOME environment variable set. I have tested the setup with Java 8.
Setting up Keycloak
Getting started with Keycloak is easy: download the latest version of Keycloak Server from the download section of the website, navigate to the “bin” directory, and run standalone.bat or standalone.sh.
You can then access Keycloak at localhost:8080. The first time, you will be required to create an initial admin user and password. Once that is done, that is pretty much it. You can play around with it after logging in to the admin console (the master realm).
Setting up Apache Knox
Download the latest Gateway Server Binary from the Knox release website. For this post the configuration is based on version 1.2.0 of the gateway.
Start LDAP embedded in Knox
Knox comes with an LDAP server for demonstration purposes. Start it using the following commands:
cd {GATEWAY_HOME}
bin/ldap.sh start
Create the Master Secret
Run the “knoxcli.sh” command to persist the master secret that is used to protect the key and credential stores for the gateway instance.
cd {GATEWAY_HOME}
bin/knoxcli.sh create-master
Start Knox
Before you start the Knox instance, navigate to “conf” and change the property “gateway.port” to 18443 (or another free port) in gateway-site.xml, as we already have Keycloak running on 8080 and 8443. Then you can start a Knox instance with the following commands:
cd {GATEWAY_HOME}
bin/gateway.sh start
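The “gateway.port” change described above can also be scripted. The following is a minimal sketch in Python, assuming gateway-site.xml uses the usual configuration/property/name/value layout. The sample XML and the set_property helper are hypothetical and only for illustration; they are not part of Knox.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for conf/gateway-site.xml (assumption).
SAMPLE_GATEWAY_SITE = """<configuration>
    <property>
        <name>gateway.port</name>
        <value>8443</value>
    </property>
    <property>
        <name>gateway.path</name>
        <value>gateway</value>
    </property>
</configuration>"""


def set_property(xml_text, name, value):
    """Return xml_text with the <value> of the named <property> replaced."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            prop.find("value").text = value
            return ET.tostring(root, encoding="unicode")
    raise KeyError(f"property {name!r} not found")


updated = set_property(SAMPLE_GATEWAY_SITE, "gateway.port", "18443")
print(updated)
```

Note that ElementTree drops XML comments when parsing, so running this against a real file would lose any comment header. For a one-off change, editing the file by hand is just as good.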
Once you are done, stop the Knox instance with:
cd {GATEWAY_HOME}
bin/gateway.sh stop
Now that Knox is up and running, we will configure SAML authentication in it. It is a three-part process.
Configure Keycloak as Identity provider in Knox
Copy “conf/topologies/knoxsso.xml” to “conf/topologies/keycloak.xml”. Then edit the copy: delete the “ShiroProvider” provider and add the pac4j provider in its place.
The full configuration will look like this:
<topology>
  <gateway>
    <provider>
      <role>federation</role>
      <name>pac4j</name>
      <enabled>true</enabled>
      <param>
        <name>pac4j.callbackUrl</name>
        <value>https://localhost:18443/gateway/keycloak/api/v1/websso</value>
      </param>
      <param>
        <name>clientName</name>
        <value>SAML2Client</value>
      </param>
      <param>
        <name>saml.identityProviderMetadataPath</name>
        <value>http://localhost:8080/auth/realms/master/protocol/saml/descriptor</value>
      </param>
      <param>
        <name>saml.serviceProviderMetadataPath</name>
        <value>./sp-metadata.xml</value>
      </param>
      <param>
        <name>saml.serviceProviderEntityId</name>
        <value>https://localhost:18443/gateway/keycloak/api/v1/websso?pac4jCallback=true&amp;client_name=SAML2Client</value>
      </param>
    </provider>
  </gateway>
  <service>
    <role>KNOXSSO</role>
    <param>
      <name>knoxsso.cookie.secure.only</name>
      <value>true</value>
    </param>
    <param>
      <name>knoxsso.token.ttl</name>
      <value>100000</value>
    </param>
    <param>
      <name>knoxsso.redirect.whitelist.regex</name>
      <value>^https?:\/\/(www\.local\.com|localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$</value>
    </param>
  </service>
</topology>
Note: There is a “saml.serviceProviderMetadataPath” parameter in the config. This is the path of the service provider metadata file. Knox will generate this file the first time you access the service.
Navigate to “https://localhost:18443/gateway/keycloak/api/v1/websso” and the file will be generated in your Knox “bin” folder.
Configure Knox as a service provider in Keycloak
In the second part, we need to configure Keycloak to authenticate the requests coming from Knox. To achieve this, log in to the Keycloak admin panel, navigate to Clients, and click Create.
On the next screen, import the sp-metadata.xml file and click Save.
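Before importing the file, it can help to confirm what Knox actually wrote into it. The following is a minimal sketch in Python that reads the entity ID and the assertion consumer URLs from SAML service provider metadata. The sample metadata is a hypothetical, heavily trimmed stand-in for a generated sp-metadata.xml, and summarize_sp_metadata is an illustrative helper, not something shipped with Knox or Keycloak.

```python
import xml.etree.ElementTree as ET

# Namespace used by SAML 2.0 metadata documents.
MD_NS = {"md": "urn:oasis:names:tc:SAML:2.0:metadata"}

# Hypothetical, trimmed example of a generated sp-metadata.xml (assumption).
SAMPLE_SP_METADATA = """<md:EntityDescriptor
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    entityID="https://localhost:18443/gateway/keycloak/api/v1/websso?pac4jCallback=true&amp;client_name=SAML2Client">
  <md:SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:AssertionConsumerService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://localhost:18443/gateway/keycloak/api/v1/websso?pac4jCallback=true&amp;client_name=SAML2Client"
        index="0"/>
  </md:SPSSODescriptor>
</md:EntityDescriptor>"""


def summarize_sp_metadata(xml_text):
    """Return the entity ID and assertion consumer service URLs."""
    root = ET.fromstring(xml_text)  # root element is md:EntityDescriptor
    acs = root.findall("md:SPSSODescriptor/md:AssertionConsumerService", MD_NS)
    return {
        "entity_id": root.get("entityID"),
        "acs_urls": [service.get("Location") for service in acs],
    }


summary = summarize_sp_metadata(SAMPLE_SP_METADATA)
print(summary["entity_id"])
print(summary["acs_urls"])
```

The entity ID printed here should be the value configured as “saml.serviceProviderEntityId” in keycloak.xml, and it is the value to look for as the client ID in Keycloak after the import.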
Secure a topology using the “SSOCookieProvider” provider
In this step we will secure a topology so that it authenticates against our SSO service. To achieve this, create a topology that is secured using a cookie issued by Knox SSO. Copy “conf/topologies/sandbox.xml” to “conf/topologies/sandbox-sso.xml”. The final contents of the file will look like this:
<?xml version="1.0" encoding="utf-8"?>
<topology>
  <gateway>
    <provider>
      <role>federation</role>
      <name>SSOCookieProvider</name>
      <enabled>true</enabled>
      <param>
        <name>sso.authentication.provider.url</name>
        <value>https://localhost:18443/gateway/keycloak/api/v1/websso</value>
      </param>
    </provider>
  </gateway>
  <service>
    <role>NAMENODE</role>
    <url>hdfs://localhost:8020</url>
  </service>
  <service>
    <role>JOBTRACKER</role>
    <url>rpc://localhost:8050</url>
  </service>
  <service>
    <role>WEBHDFS</role>
    <url>http://localhost:50070/webhdfs</url>
  </service>
  <service>
    <role>WEBHCAT</role>
    <url>http://localhost:50111/templeton</url>
  </service>
  <service>
    <role>OOZIE</role>
    <url>http://localhost:11000/oozie</url>
    <param>
      <name>replayBufferSize</name>
      <value>8</value>
    </param>
  </service>
  <service>
    <role>WEBHBASE</role>
    <url>http://localhost:60080</url>
    <param>
      <name>replayBufferSize</name>
      <value>8</value>
    </param>
  </service>
  <service>
    <role>HIVE</role>
    <url>http://localhost:10001/cliservice</url>
    <param>
      <name>replayBufferSize</name>
      <value>8</value>
    </param>
  </service>
  <service>
    <role>RESOURCEMANAGER</role>
    <url>http://localhost:8088/ws</url>
  </service>
  <service>
    <role>DRUID-COORDINATOR-UI</role>
    <url>http://localhost:8081</url>
  </service>
  <service>
    <role>DRUID-COORDINATOR</role>
    <url>http://localhost:8081</url>
  </service>
  <service>
    <role>DRUID-BROKER</role>
    <url>http://localhost:8082</url>
  </service>
  <service>
    <role>DRUID-ROUTER</role>
    <url>http://localhost:8082</url>
  </service>
  <service>
    <role>DRUID-OVERLORD</role>
    <url>http://localhost:8090</url>
  </service>
  <service>
    <role>DRUID-OVERLORD-UI</role>
    <url>http://localhost:8090</url>
  </service>
</topology>
Add individual services that you want to be protected in the <service> section.
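Hand-edited topology files are easy to break, for example by leaving a <param> element unclosed. The following is a minimal sketch in Python that checks a topology is well-formed XML and lists its providers and services. The sample topology is a shortened, hypothetical stand-in for sandbox-sso.xml, and summarize_topology is an illustrative helper, not a Knox tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for conf/topologies/sandbox-sso.xml (assumption).
SAMPLE_TOPOLOGY = """<topology>
  <gateway>
    <provider>
      <role>federation</role>
      <name>SSOCookieProvider</name>
      <enabled>true</enabled>
      <param>
        <name>sso.authentication.provider.url</name>
        <value>https://localhost:18443/gateway/keycloak/api/v1/websso</value>
      </param>
    </provider>
  </gateway>
  <service>
    <role>WEBHDFS</role>
    <url>http://localhost:50070/webhdfs</url>
  </service>
  <service>
    <role>OOZIE</role>
    <url>http://localhost:11000/oozie</url>
    <param>
      <name>replayBufferSize</name>
      <value>8</value>
    </param>
  </service>
</topology>"""


def summarize_topology(xml_text):
    """Parse a topology and return (providers, services).

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed,
    for example when a <param> is missing its closing tag.
    """
    root = ET.fromstring(xml_text)
    providers = [
        (p.findtext("role"), p.findtext("name"))
        for p in root.findall("gateway/provider")
    ]
    # Keyed by role, so a role listed twice would be collapsed to one entry.
    services = {s.findtext("role"): s.findtext("url") for s in root.findall("service")}
    return providers, services


providers, services = summarize_topology(SAMPLE_TOPOLOGY)
print(providers)
print(services)
```

This only checks that the file parses and shows which roles it declares. It does not validate the topology against what Knox itself accepts, so the gateway log remains the authoritative place to look when a topology fails to deploy.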
To check the configuration you can navigate to https://localhost:18443/gateway/sandbox-sso/webhdfs/v1/?op=LISTSTATUS
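After login, KnoxSSO only redirects back to URLs that match “knoxsso.redirect.whitelist.regex” from keycloak.xml, so it is worth checking that the URL you test with passes that pattern. The following is a minimal sketch in Python, assuming the pattern from this post. Knox evaluates the pattern with Java's regex engine; Python's re module is used here as an approximation, which should behave the same for this simple pattern. The is_whitelisted helper and the non-localhost sample host are hypothetical.

```python
import re

# Pattern copied from the knoxsso.redirect.whitelist.regex value used in this post.
WHITELIST = (
    r"^https?:\/\/(www\.local\.com|localhost|127\.0\.0\.1"
    r"|0:0:0:0:0:0:0:1|::1):[0-9].*$"
)


def is_whitelisted(url, pattern=WHITELIST):
    """Return True if the URL matches the redirect whitelist pattern."""
    return re.match(pattern, url) is not None


candidates = [
    "https://localhost:18443/gateway/sandbox-sso/webhdfs/v1/?op=LISTSTATUS",
    "http://127.0.0.1:8080/",
    "https://localhost/gateway/sandbox-sso/",  # no explicit port
    "https://namenode.example.com:18443/gateway/sandbox-sso/",  # hypothetical host
]
for url in candidates:
    print(is_whitelisted(url), url)
```

With this pattern, the host must be one of the listed local names and the URL must carry an explicit port. If you access the gateway through any other hostname, the pattern has to be extended to cover it.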
2 comments
Hello Shiva Sir,
Your blog is very useful for securing a Hadoop cluster using Keycloak.
I followed your steps, but I am facing an issue:
1] After authenticating with Keycloak, the URL does not redirect to my Hadoop cluster and gives ERROR 500 - Problem accessing /gateway/keycloak/api/v1/websso. Request Failed.
Is this an issue with redirecting the URL via Knox to the Hadoop cluster?
How do I set the redirect URL for other topologies of the Hadoop cluster (what should be changed in the .xml file)?
Thanks for Your Blog
Hi Digvijay,
This could be because Keycloak is not running. Please check whether you can access /gateway/keycloak/api/v1/websso from the browser; checking the errors in the logs will also help.