# Vault Honeypots

I spoke at an event in Oslo where Vesselin gave a talk about how honeypots could be used to automate network security by automatically updating Cloud Armor rules to block the attacker. At the time I was giving a talk on Terraform and Vault, but as I watched Vesselin's talk I wanted to build a system that could automate firewall rules using Terraform.

## Introduction

Before we continue, let's take a quick look at what a honeypot is. The best way to find any definition is to google it.

Not quite what I was looking for, let's check out the second listing.

```
A honeypot is a network-attached system set up as a decoy to lure cyber attackers and detect, deflect and study hacking attempts to gain unauthorized access to information systems.
The function of a honeypot is to represent itself on the internet as a potential target for attackers (usually a server or other high-value asset) and to gather information and notify defenders of any attempts to access the honeypot by unauthorized users.
```

Better. So we have a fake server, in our instance a fake Vault server. We want to report any malicious activity from this server and use it to configure our firewall.

The flow that Vesselin proposed was this; we are going to build something very similar.

Before we do, let's look at the last part of this puzzle. What is infrastructure as code?

So now that we have defined what things are, let's see the system in action. We are going to play you a video first, because we are going to live code this and there is a very good chance things will go horribly wrong.

## Honeypot in Action

**Show Video**

## Building the system

This is what our system is going to look like; it is fairly simple for demo purposes. The flow, which is very similar to Vesselin's, looks like this.

**Explain Flow**

## Managing firewall rules using infrastructure as code

So let's take a look at our first part: how are we going to update our Cloud Armor rules? We are going to use some Terraform. Let's build that up.
First we need to define a variable to store the banned IP addresses in; we will use this to dynamically generate our firewall rules.

**1_variable variables.tf**

```javascript
variable "deny_list" {
  default     = ""
  description = "Deny list for application security policy"
}
```

Next let's create a local variable that takes our Terraform input variable and converts it into a list.

**2_local loadbalancer.tf**

```javascript
locals {
  # compact drops empty entries, so an empty deny_list produces no rules
  deny_list = compact(split(",", trim(var.deny_list, ",")))
}
```

Then we can define our Cloud Armor policy.

**3_rule loadbalancer.tf**

```javascript
resource "google_compute_security_policy" "security-policy-1" {
  name        = "armor-security-policy"
  description = "security policy for cloud run"

  # By default allow all traffic
  rule {
    action   = "allow"
    priority = "2147483647"

    match {
      versioned_expr = "SRC_IPS_V1"

      config {
        src_ip_ranges = ["*"]
      }
    }

    description = "Lowest priority rule, all other rules will be evaluated first"
  }
}
```

What we need to do is add another rule to this that is built up from our variable. In fact we need to add one rule for each blocked IP address, and that looks like this.

**4_dynamic loadbalancer.tf**

```javascript
  # Create a Deny rule for all ips in the deny_list variable
  dynamic "rule" {
    for_each = local.deny_list

    content {
      action = "deny(403)"

      # Use the position of the ip in the list as the rule priority
      priority = index(local.deny_list, rule.value)

      match {
        versioned_expr = "SRC_IPS_V1"

        config {
          src_ip_ranges = [rule.value]
        }
      }
    }
  }
```

Once all that is done we can assign the policy to the backend.

**5_policy loadbalancer.tf**

```javascript
  security_policy = google_compute_security_policy.security-policy-1.self_link
```

Now that all that is done, let's run a terraform plan to check that things are working, and push the code. You can see that Terraform Cloud is running a plan and we can see the changes, so let's press apply and our rules get updated.
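To make it concrete how one comma separated string turns into a set of rules, here is a minimal sketch in Go that mirrors the split and index logic from the Terraform above. The `denyRule` type, the `expandDenyList` function and the sample addresses are hypothetical and only exist for this illustration; skipping empty entries is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// denyRule is a hypothetical stand-in for one generated Cloud Armor rule.
type denyRule struct {
	Priority int
	SrcIP    string
}

// expandDenyList mirrors the Terraform expression: trim stray commas,
// split on "," and use each entry's position as the rule priority.
// Empty entries are skipped, which is an assumption of this sketch.
func expandDenyList(denyList string) []denyRule {
	rules := []denyRule{}

	for _, ip := range strings.Split(strings.Trim(denyList, ","), ",") {
		if ip == "" {
			continue
		}

		rules = append(rules, denyRule{Priority: len(rules), SrcIP: ip})
	}

	return rules
}

func main() {
	// Sample addresses from the documentation ranges, not real attackers
	for _, r := range expandDenyList("203.0.113.7,198.51.100.23,") {
		fmt.Printf("priority %d: deny(403) %s\n", r.Priority, r.SrcIP)
	}
}
```

Because the priority is just the position in the list, the first banned address gets priority 0 and is evaluated long before the catch-all allow rule at 2147483647.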
## Automating Terraform from Honeypot Events

This is fine, but what we really want is to automate this so that an event from the Honeypot automatically triggers the update. Let's create a new cloud function that will update the variables in Terraform whenever the Honeypot detects a new threat.

### New Attack

We are going to write this in Go, but you could use any language that is supported by Cloud Functions.

**6_attack_func new-attack/main.go**

```go
func init() {
	// Register a CloudEvent function with the Functions Framework
	functions.CloudEvent("NewAttackFunction", newAttackFunction)
}

// Function newAttackFunction accepts and handles a CloudEvent object
// and updates terraform with the details of the new attacker.
func newAttackFunction(ctx context.Context, e event.Event) error {
	// The body is built up step by step in the following snippets
}
```

The first thing we need to do is grab some settings that we are passing to the function as environment variables.

**7_env_vars new-attack/main.go**

```go
log.Println("New Attack Event called with data", string(e.Data()))

tfeAddress := os.Getenv("TFE_ADDRESS")
tfeToken := os.Getenv("TFE_TOKEN")
tfeWorkspace := os.Getenv("TFE_WORKSPACE")
tfeVariable := os.Getenv("TFE_VARIABLE")
```

The payload that Vault sends is a simple JSON payload containing the IP address. We need to parse this from the payload sent in the event.
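To make the shape of that payload concrete, here is a minimal, self-contained sketch that builds an event body and unpacks it again. The field names follow the structs used in this walkthrough (the inner JSON is base64 encoded into `message.data`); the `envelope` and `attack` types, the `wrap` and `unwrap` helpers and the sample IP are hypothetical and only there for illustration.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// envelope is the outer event body, the payload sits in message.data.
type envelope struct {
	Message struct {
		Data string `json:"data"`
	} `json:"message"`
}

// attack is the inner JSON payload carrying the attacker's IP.
type attack struct {
	IP string `json:"ip"`
}

// wrap builds an event body by base64 encoding the inner JSON into message.data.
func wrap(ip string) ([]byte, error) {
	inner, err := json.Marshal(attack{IP: ip})
	if err != nil {
		return nil, err
	}

	var env envelope
	env.Message.Data = base64.StdEncoding.EncodeToString(inner)

	return json.Marshal(env)
}

// unwrap reverses wrap and returns the attacker's IP.
func unwrap(body []byte) (string, error) {
	var env envelope
	if err := json.Unmarshal(body, &env); err != nil {
		return "", err
	}

	inner, err := base64.StdEncoding.DecodeString(env.Message.Data)
	if err != nil {
		return "", err
	}

	var a attack
	if err := json.Unmarshal(inner, &a); err != nil {
		return "", err
	}

	return a.IP, nil
}

func main() {
	body, err := wrap("203.0.113.7")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	ip, err := unwrap(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(ip)
}
```

The two-step decode is the important part: first the outer JSON, then base64, then the inner JSON.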
**8_decode new-attack/main.go**

```go
// Decode the message
message, err := decodeMessage(e)
if err != nil {
	log.Println("Unable to decode message", err)

	// Do not attempt re-delivery as the message is invalid
	return nil
}
```

**9_decode_func new-attack/main.go**

```go
type pubSubData struct {
	Message pubSubMessage `json:"message"`
}

type pubSubMessage struct {
	Data string `json:"data"`
}

type attackMessage struct {
	IP string `json:"ip"`
}

type applyMessage struct {
	Type string `json:"type"`
}

// decodeMessage decodes the message from the event
func decodeMessage(e event.Event) (*attackMessage, error) {
	var data pubSubData
	var message attackMessage

	// decode the data
	err := json.Unmarshal(e.Data(), &data)
	if err != nil {
		return nil, fmt.Errorf("unable to deserialize data: %s", err)
	}

	// the message payload is base64 encoded
	md, err := base64.StdEncoding.DecodeString(data.Message.Data)
	if err != nil {
		return nil, fmt.Errorf("unable to decode message data: %s", err)
	}

	// parse the message
	err = json.Unmarshal(md, &message)
	if err != nil {
		return nil, fmt.Errorf("unable to deserialize message: %s", err)
	}

	return &message, nil
}
```

Once the message is decoded we can get the attacker's IP.

**10_check_ip new-attack/main.go**

```go
// If the message has no payload do nothing
if message.IP == "" {
	log.Println("Message does not contain an IP address, skipping")
	return nil
}
```

Then we can construct a TFE client and get the existing value of the variable.

**11_tfe_client new-attack/main.go**

```go
// Configure the TFE client
config := &tfe.Config{
	Address:           tfeAddress,
	Token:             tfeToken,
	RetryServerErrors: true,
}

// Create a TFE client
tfeClient, err := tfe.NewClient(config)
if err != nil {
	log.Println("Unable to create TFE client", err)
	return err
}

denyList, err := getVariableFromTFE(tfeClient, tfeWorkspace, tfeVariable)
if err != nil {
	log.Println("Unable to get variable from TFE", err)
	return err
}
```

**12_get_func new-attack/main.go**

```go
// getVariableFromTFE gets the given variable from a workspace
func getVariableFromTFE(client *tfe.Client, workspace, variable string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	vars, err := client.Variables.List(ctx, workspace, &tfe.VariableListOptions{})
	if err != nil {
		return "", fmt.Errorf("unable to list variables for workspace %s: %s", workspace, err)
	}

	for _, v := range vars.Items {
		if v.Key == variable {
			return v.Value, nil
		}
	}

	// the variable does not exist, treat it as empty
	return "", nil
}
```

Once we have the variable, let's check that the IP is not already on our list.

**13_update_ip new-attack/main.go**

```go
// Add the IP to the variable if not present
log.Printf("Updating variable '%s', current value: '%s', adding '%s'\n", tfeVariable, denyList, message.IP)

newDenyList, err := addIPIfNotPresent(denyList, message.IP)
if err != nil {
	log.Println("Address already exists in list, quitting")
	return nil
}
```

**14_update_func new-attack/main.go**

```go
// addIPIfNotPresent returns the deny list with messageIP appended,
// or an error if the ip is already in the list
func addIPIfNotPresent(variable, messageIP string) (string, error) {
	// the variable is stored as a comma separated list, build a slice and
	// skip empty entries so an empty variable does not yield a leading comma
	ips := []string{}
	for _, ip := range strings.Split(variable, ",") {
		if ip == "" {
			continue
		}

		if ip == messageIP {
			return "", fmt.Errorf("ip already exists in collection")
		}

		ips = append(ips, ip)
	}

	// add the ip to the collection
	ips = append(ips, messageIP)

	return strings.Join(ips, ","), nil
}
```

If it is not already on the list, let's update TFE with the new value.

**15_update_tfe new-attack/main.go**

```go
err = updateVariableInTFE(tfeClient, tfeWorkspace, tfeVariable, newDenyList)
if err != nil {
	log.Println("Unable to update variable", err)
	return err
}
```

**16_tfe_update_func new-attack/main.go**

```go
// updateVariableInTFE updates the given variable with the new value
func updateVariableInTFE(client *tfe.Client, workspace, variable, value string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	vars, err := client.Variables.List(ctx, workspace, &tfe.VariableListOptions{})
	if err != nil {
		return fmt.Errorf("unable to list variables for workspace %s: %s", workspace, err)
	}

	for _, v := range vars.Items {
		if v.Key == variable {
			_, err := client.Variables.Update(ctx, workspace, v.ID, tfe.VariableUpdateOptions{Value: &value})
			if err != nil {
				return fmt.Errorf("unable to update variable: %s", err)
			}

			return nil
		}
	}

	return nil
}
```

Finally, once TFE has been updated, we can trigger the run by sending a new message to a different Pub/Sub topic.

**17_trigger_plan new-attack/main.go**

```go
log.Println("Triggering plan")

// trigger a cloud update
err = triggerPlan()
if err != nil {
	log.Println("Unable to trigger pubsub update", err)
	return err
}

return nil
```

**17_trigger_plan new-attack/main.go**

```go
// triggerPlan publishes a message to the apply topic, this in turn
// starts a new plan in TFE
func triggerPlan() error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Sets your Google Cloud Platform project ID.
	projectID := os.Getenv("GCP_PROJECT_ID")
	if projectID == "" {
		return fmt.Errorf("GCP_PROJECT_ID is not set")
	}

	// Creates a client.
	client, err := pubsub.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("failed to create pubsub client: %s", err)
	}
	defer client.Close()

	// Get a reference to the existing topic.
	topic := client.Topic("tfe-apply-topic")
	defer topic.Stop()

	message := &applyMessage{
		Type: "new_attack",
	}

	data, err := json.Marshal(message)
	if err != nil {
		return fmt.Errorf("unable to serialize message: %s", err)
	}

	// Publish is asynchronous, Get blocks until the message is sent or fails
	if _, err := topic.Publish(ctx, &pubsub.Message{Data: data}).Get(ctx); err != nil {
		return fmt.Errorf("unable to publish message: %s", err)
	}

	return nil
}
```

### Run Apply

Let's now look at the apply function. Like the last function, we need to fetch some variables and get the payload.

**18_decode run-apply/main.go**

```go
tfeAddress := os.Getenv("TFE_ADDRESS")
tfeToken := os.Getenv("TFE_TOKEN")
tfeWorkspace := os.Getenv("TFE_WORKSPACE")

log.Println("New Apply Event called with data", string(e.Data()))

// decode the data
data, err := decodeMessage(e)
if err != nil {
	log.Println("Unable to deserialize data", err)

	// Do not retry, the message is invalid
	return nil
}
```

**19_decode run-apply/main.go**

```go
type pubSubData struct {
	Message pubSubMessage `json:"message"`
}

type pubSubMessage struct {
	PublishTime time.Time `json:"publishTime"`
	Data        string    `json:"data"`
}

// decodeMessage decodes the message from the event
func decodeMessage(e event.Event) (*pubSubData, error) {
	var data pubSubData

	// decode the data
	err := json.Unmarshal(e.Data(), &data)
	if err != nil {
		return nil, fmt.Errorf("unable to deserialize data: %s", err)
	}

	return &data, nil
}
```

Then, like the other function, we create the client.

**20_create_client run-apply/main.go**

```go
// Configure the TFE client
config := &tfe.Config{
	Address:           tfeAddress,
	Token:             tfeToken,
	RetryServerErrors: true,
}

// Create a TFE client
tfeClient, err := tfe.NewClient(config)
if err != nil {
	log.Println("Unable to create TFE client", err)
	return err
}
```

Then let's get the latest run status.

**21_get_status run-apply/main.go**

```go
ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
defer cancel()

status, createdTime, err := getLatestRunStatus(tfeClient, tfeWorkspace)
if err != nil {
	log.Println("Unable to get run status")
	return fmt.Errorf("unable to get run status: %s", err)
}
```

**22_get_status_func run-apply/main.go**

```go
// getLatestRunStatus gets the latest run status for the workspace
func getLatestRunStatus(client *tfe.Client, workspace string) (tfe.RunStatus, time.Time, error) {
	runs, err := client.Runs.List(context.Background(), workspace, &tfe.RunListOptions{})
	if err != nil {
		return "", time.Time{}, fmt.Errorf("unable to query runs: %s", err)
	}

	if len(runs.Items) > 0 {
		return runs.Items[0].Status, runs.Items[0].CreatedAt, nil
	}

	// no runs exist for this workspace yet
	return "", time.Time{}, nil
}
```

**23_check run-apply/main.go**

```go
// if the last run was created after the publish time for this message, ignore it
if createdTime.After(data.Message.PublishTime) {
	log.Println("Ignore message, a run has been created after this message was published")
	return nil
}

// if the current run status is planning or applying, redeliver the message later
if status == tfe.RunPlanning || status == tfe.RunApplying {
	log.Printf("Current run status is %s, retry later\n", status)
	return fmt.Errorf("retry message")
}
```

Finally we can run an apply.

**24_check run-apply/main.go**

```go
// Create the plan
err = applyConfig(tfeClient, tfeWorkspace)
if err != nil {
	log.Printf("Error creating plan, %s\n", err)
	return nil
}

// Return nil if no error occurred
return nil
```

**25_apply_func run-apply/main.go**

```go
// applyConfig creates a new plan and apply in TFE
func applyConfig(client *tfe.Client, workspace string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	ws, err := client.Workspaces.ReadByID(ctx, workspace)
	if err != nil {
		return fmt.Errorf("unable to read workspace: %s", err)
	}

	log.Println("Creating Plan")
	run, err := client.Runs.Create(ctx, tfe.RunCreateOptions{Workspace: ws, Type: "runs"})
	if err != nil {
		return fmt.Errorf("unable to create plan: %s", err)
	}

	// Runs are async, check the state and when done apply
	for {
		if ctx.Err() != nil {
			return fmt.Errorf("timeout waiting for run")
		}

		// Check the state of the run
		run, err = client.Runs.Read(ctx, run.ID)
		if err != nil {
			return fmt.Errorf("unable to check plan status: %s", err)
		}
		if run.Status == tfe.RunPlannedAndFinished {
			log.Println("No changes to make")
			return nil
		}

		if run.Status == tfe.RunErrored {
			return fmt.Errorf("run failed: %s", run.Message)
		}

		if run.Status == tfe.RunPlanned {
			log.Println("Applying Plan")

			// Run the apply
			err := client.Runs.Apply(ctx, run.ID, tfe.RunApplyOptions{})
			if err != nil {
				return fmt.Errorf("unable to apply plan: %s", err)
			}
		}

		if run.Status == tfe.RunApplied {
			log.Println("Apply complete")
			return nil
		}

		// wait before polling the run again
		time.Sleep(10 * time.Second)
	}
}
```

Let's now deploy this.

We can now test it.