Gloo Network comes with an insights engine that automatically analyzes your Cilium setup for health issues. Then, Gloo shares these issues along with recommendations to harden your Cilium setup. The insights give you a checklist to address issues that might otherwise be hard to detect across your environment.

Launch the Gloo UI

To review the analysis and insights that Gloo Network generates for your setup, launch the Gloo UI.

  1. Open the Gloo UI. The Gloo UI is served from the gloo-mesh-ui service on port 8090. You can connect by using the meshctl or kubectl CLIs.

    • meshctl: For more information, see the CLI documentation.
        meshctl dashboard
        
    • kubectl:
      1. Port-forward the gloo-mesh-ui service on port 8090.
          kubectl port-forward -n gloo-mesh svc/gloo-mesh-ui 8090:8090
          
      2. Open your browser and connect to http://localhost:8090.
  2. Review your Dashboard. The dashboard shows an at-a-glance overview of your Gloo environment, including your Cilium setup.

    Figure: Gloo UI dashboard

Review installation health and insights

On the Analysis and Insights card of the dashboard, you can quickly see a summary of the insights for your environment, including how many insights are available at each severity level, and the type of insights. To view the list of insights, you can click the Details buttons, or go to the Insights page.

Figure: Insights and analysis card

View all insights

On the Insights page, you can view recommendations to harden your Cilium setup, and steps to implement them in your environment. Gloo Network analyzes your setup, and returns individual insights that contain information about errors and warnings in your environment, best practices you can use to improve your configuration and security, and more.

Figure: Insights page

In the list of all insights, each insight has the following attributes:

  • Level: The severity level of the insight.
    • Info: Informational reports, such as summaries of the current state of resources, or best practice recommendations, such as steps you can take to conform to Istio standards.
    • Warning: Potential issues that might affect the functionality of your setup.
    • Error: Issues that currently affect the functionality of your setup, and must be resolved.
  • Category: The type of the insight.
    • Best Practice: Best practice recommendations, such as scoping resources to namespaces.
    • Configuration: Configuration of Cilium resources, such as validation checks or recommended fields.
    • Health: Health checks and status updates for components of your Cilium installation.
    • Security: Security of your Cilium setup.
  • Summary: A short description of the insight.
  • Resource: The name, namespace, and cluster of the resource that the insight refers to. For example, argocd-vs.argocd.mgmt refers to the virtual service named argocd-vs in the argocd namespace of your mgmt cluster.
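
The Resource format above can be split into its parts with a small shell helper. This is only a sketch: `parse_resource_ref` is a hypothetical function, not part of any Gloo CLI, and it assumes that the namespace and cluster names contain no dots (the resource name itself may).

```shell
# Hypothetical helper: split "<name>.<namespace>.<cluster>" into its parts.
# Assumption: the last two dot-separated fields are the namespace and cluster.
parse_resource_ref() {
  ref=$1
  cluster=${ref##*.}      # text after the last dot
  rest=${ref%.*}          # everything before the last dot
  namespace=${rest##*.}   # text after the last remaining dot
  name=${rest%.*}         # everything before that dot
  printf 'name=%s namespace=%s cluster=%s\n' "$name" "$namespace" "$cluster"
}
```

For example, `parse_resource_ref argocd-vs.argocd.mgmt` prints `name=argocd-vs namespace=argocd cluster=mgmt`, which you can then use to look up the resource in the right cluster and namespace.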

You can optionally use the filters to view insights by severity level or category, and the Search by cluster dropdown to filter insights by particular clusters.

Resolve insights

For detailed information about how to resolve each insight, click Details.

Figure: Example insight
  • Summary: The summary tab shows more data about the insight, such as the time when it was last observed in your environment, and details about configuration fields that might need attention. This example insight warns that a virtual service is exported to all namespaces, which is not recommended for security reasons.
  • Target YAML: If applicable, the YAML shows the resource file that the insight references, such as a virtual service or gateway.
  • View Resolution Steps: If applicable, the resolution tab provides steps that you can take to resolve the insight. For example, you might follow the steps to change specific settings in your Cilium resources. Or, for further functionality and benefits, you might consider upgrading to other Solo products, such as Gloo Mesh Enterprise.

Review Cilium health

Review the following dashboard cards to monitor your Cilium installation’s health.

Check overall installation health

The Cilium and Gloo health card of the dashboard provides a check of the Cilium and Gloo Network installations in your clusters.

Gloo

The Gloo tab provides an at-a-glance status of the health of each Gloo Network component. You can click the button next to each component to view its pod logs. The environment check shows all versions of Gloo Network that are installed in your environment, the state of each installation, and the number of clusters in your environment.

Figure: Gloo installation tab in the health card

Cilium

The Cilium tab provides an at-a-glance status of the health of each Cilium component. The environment check shows the count and state of all versions of Cilium that are installed in your environment, and can help you identify overall issues with your installations.

Figure: Cilium installation tab in the health card

Check node connectivity

The Node Connectivity card of the dashboard shows the number of nodes that Cilium reports as connected or disconnected across all clusters in your Gloo Network setup.

Cilium tracks each node’s ability to connect to other nodes by performing connectivity checks between the Cilium agent on the node and other Cilium nodes. When a node is reported as disconnected, it implies one or more other nodes are unable to establish connectivity to it.

Figure: Node Connectivity card in the Gloo UI

To view a list of all nodes in your Cilium setup, you can click the Details button, or go to the Inventory > Nodes page. You can use the Healthy and Unhealthy buttons to sort the nodes, such as to find disconnected nodes.

Figure: Node inventory list in the Gloo UI

Review network policy coverage

The Network Policies card of the dashboard shows the percentage of Cilium endpoints in your clusters that have a network policy applied, and the number of policy violations in the last 5 minutes.

Figure: Network Policies card in the Gloo UI

To see a list of all policies in your environment, you can click the Details button, or go to the Resources > Cilium page. You can use the Healthy and Unhealthy buttons to sort the policies, or click View YAML to see each policy’s configuration file.

To check which services and traffic requests are violating your network policies, you can click the Details button, or go to the Observability > Hubble UI page. For more information, see Hubble UI.
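
To make the coverage figure on the Network Policies card concrete, the following sketch computes a percentage from two counts. How Gloo Network derives the value internally is not described here, so treat this as an illustration only: `coverage_percent` is a hypothetical helper, and both counts are inputs that you supply.

```shell
# Hypothetical helper: percentage of endpoints with a network policy applied.
# $1 = number of endpoints covered by a policy, $2 = total number of endpoints.
# Uses integer arithmetic, so the result is rounded down.
coverage_percent() {
  covered=$1
  total=$2
  if [ "$total" -eq 0 ]; then
    echo 0                # avoid dividing by zero when no endpoints exist
    return
  fi
  echo $(( covered * 100 / total ))
}
```

For example, `coverage_percent 3 4` prints `75`. As one possible source for the total, `kubectl get ciliumendpoints -A --no-headers | wc -l` counts the CiliumEndpoint resources in a cluster.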

Check scalability health

The Cilium Health card of the dashboard shows scalability checks for the current eBPF map pressure, IP address exhaustion, and endpoint regeneration time in your environment. To confirm that your Cilium setup can scale as your environment grows, check that these three cards are blue. If a check reaches the notification threshold, its card becomes yellow, such as the Endpoint Regeneration Time card in the following example. If a check reaches the maximum limit, its card becomes red, such as the BPF Map Pressure and IPAM Exhaustion cards. In these cases, you can click Details to see the associated insights for these checks.

Figure: Cilium Health card in the Gloo UI
  • BPF Map Pressure: The pressure of the eBPF maps, as measured against the map’s set max limit. eBPF map pressure can become an issue as the cluster experiences increased load. Cilium provides sufficient map defaults for typical load levels, but if you have a large number of identities, you might experience map pressure.
  • IPAM Exhaustion: IP Address Management (IPAM) allocates and manages the IP addresses that are used by the network endpoints (containers and others) that Cilium manages. Because the number of available IP addresses is limited by the specified CIDR, IP address exhaustion can occur.
  • Endpoint Regeneration Time: When the policy enforced on a Cilium endpoint changes due to a change in identity, policy, or configuration, the Cilium agent regenerates the endpoint. During this process, the agent updates the endpoint’s networking configuration, which can include reprogramming eBPF programs that power the data path for that endpoint in the Linux kernel. Endpoint regeneration can be slow when you have many IP addresses or nodes in your cluster, because Cilium agents might attempt to regenerate many IP endpoints concurrently.
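
The blue, yellow, and red states of these cards follow a two-threshold pattern, which the following sketch illustrates. The actual thresholds that Gloo Network uses are not stated here, so `card_state` is a hypothetical helper and the notification threshold and maximum limit are assumed inputs, given as whole numbers such as percentages.

```shell
# Hypothetical helper: map a measured value to a card color.
# $1 = current value, $2 = notification threshold, $3 = maximum limit.
# All three are integers in the same unit (for example, percent).
card_state() {
  value=$1
  notify=$2
  max=$3
  if [ "$value" -ge "$max" ]; then
    echo red              # maximum limit reached
  elif [ "$value" -ge "$notify" ]; then
    echo yellow           # notification threshold reached
  else
    echo blue             # below both thresholds
  fi
}
```

With an assumed notification threshold of 80 and a limit of 100, `card_state 85 80 100` prints `yellow`.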

Disable insights

As you resolve insights in your environment, you might want to ignore or remove some insights instead of resolving them. For example, an insight that gives a warning for production usage might not be relevant when you try out a new feature in a sandbox Istio environment.

To disable an insight and remove it from your insights list in the Gloo UI:

  1. Upgrade your Gloo CRDs Helm chart to include the --set featureGates.insightsConfiguration=true flag.
      helm upgrade -i gloo-platform-crds gloo-platform/gloo-platform-crds \
       --namespace=gloo-mesh \
       --create-namespace \
       --version=$GLOO_VERSION \
       --set installEnterpriseCrds=false \
       --set featureGates.insightsConfiguration=true
      
  2. Open the Gloo UI. The Gloo UI is served from the gloo-mesh-ui service on port 8090. You can connect by using the meshctl or kubectl CLIs.

    • meshctl: For more information, see the CLI documentation.
        meshctl dashboard
        
    • kubectl:
      1. Port-forward the gloo-mesh-ui service on port 8090.
          kubectl port-forward -n gloo-mesh svc/gloo-mesh-ui 8090:8090
          
      2. Open your browser and connect to http://localhost:8090.
  3. From the left-hand navigation, click Home > Insights.
  4. Click the Details of the insight that you want to disable, and find its Code.
  5. Include the insight’s code in an InsightsConfig resource. For example, the following resource disables the CFG0002 and CFG0003 insights.
      kubectl apply -f - << EOF
      apiVersion: admin.gloo.solo.io/v2alpha1
      kind: InsightsConfig
      metadata:
        name: insights-config
        namespace: gloo-mesh
      spec:
        disabledInsights:
          - CFG0002
          - CFG0003
      EOF
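
If you disable insights often, you might generate the resource from a list of codes instead of editing the YAML by hand. The following is a minimal sketch: `render_insights_config` is a hypothetical helper, and it reuses the resource name, namespace, and API version from the example above.

```shell
# Hypothetical helper: print an InsightsConfig that disables the given codes.
# Each argument is one insight code, such as CFG0002.
render_insights_config() {
  cat <<EOF
apiVersion: admin.gloo.solo.io/v2alpha1
kind: InsightsConfig
metadata:
  name: insights-config
  namespace: gloo-mesh
spec:
  disabledInsights:
EOF
  for code in "$@"; do
    printf '    - %s\n' "$code"   # one list entry per insight code
  done
}
```

To apply the result, pipe it to kubectl, such as `render_insights_config CFG0002 CFG0003 | kubectl apply -f -`. Because the command applies the whole resource, pass every code that you want to keep disabled, not only the new ones.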