SkBuffContext and IPv6 #82

Open
jornfranke opened this issue Dec 31, 2022 · 14 comments

Comments

@jornfranke

Hi,

I am writing a socket filter (will be open sourced once working). I use this example as a basis: https://github.com/aya-rs/book/tree/main/examples/cgroup-skb-egress, although my program is a socket filter rather than a cgroup_skb program.

I have it working for IPv4 and IPv6 sockets. However, in the eBPF program itself I can only extract the IPv4 address correctly. For IPv6 sockets, I do not get a valid IP address.

I extract the IPv4 address similarly to here:
https://github.com/aya-rs/book/blob/main/examples/cgroup-skb-egress/cgroup-skb-egress-ebpf/src/main.rs#L46

This is my code to extract the IPv6 address:

 u128::from_be(ctx.load(offset_of!(ipv6hdr, saddr)).unwrap()) 

The result is obviously not an IP address, because the value changes all the time. This is ipv6hdr:

pub struct ipv6hdr {
    pub _bitfield_align_1: [u8; 0],
    pub _bitfield_1: __BindgenBitfieldUnit<[u8; 1usize]>,
    pub flow_lbl: [__u8; 3usize],
    pub payload_len: __be16,
    pub nexthdr: __u8,
    pub hop_limit: __u8,
    pub saddr: [__be32; 4usize],
    pub daddr: [__be32; 4usize],
}

I am not sure whether, in the case of IPv6, the ipv6hdr is at a different offset and I need to add something.

Any idea?

Thanks a lot.
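
For reference, in the fixed 40-byte IPv6 header the source address occupies bytes 8..24 and the destination address bytes 24..40, which matches the `ipv6hdr` struct above (1 + 3 + 2 + 1 + 1 = 8 bytes before `saddr`). The following is a minimal user-space sketch in plain Rust, not eBPF code. It assumes the buffer really starts at the IPv6 header, and `parse_ipv6_addrs` is a hypothetical helper name, not part of aya:

```rust
use std::net::Ipv6Addr;

/// Length of the fixed IPv6 header in bytes.
const IPV6_HDR_LEN: usize = 40;
/// Byte offset of the source address within the IPv6 header.
const SADDR_OFF: usize = 8;
/// Byte offset of the destination address within the IPv6 header.
const DADDR_OFF: usize = 24;

/// Hypothetical helper: returns (source, destination) if `buf` starts with
/// an IPv6 header, or None if it is too short or the version nibble is not 6.
fn parse_ipv6_addrs(buf: &[u8]) -> Option<(Ipv6Addr, Ipv6Addr)> {
    if buf.len() < IPV6_HDR_LEN || buf[0] >> 4 != 6 {
        return None;
    }
    let mut src = [0u8; 16];
    let mut dst = [0u8; 16];
    src.copy_from_slice(&buf[SADDR_OFF..SADDR_OFF + 16]);
    dst.copy_from_slice(&buf[DADDR_OFF..DADDR_OFF + 16]);
    Some((Ipv6Addr::from(src), Ipv6Addr::from(dst)))
}
```

The version-nibble check is a cheap sanity test: if the data actually begins with a transport header, the first nibble will usually not be 6, which hints that the offset assumption is wrong.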

@FallingSnow

Are you taking into account the ethernet header offset?

ETH_HDR_LEN + offset_of!(iphdr, saddr)

@jornfranke
Author

Thanks for the quick answer.

For IPv4 for sock_filter (not TC) it is not needed and it works there perfectly without ETH_HDR. For IPv6 with or without it makes no difference - Maybe I do not correctly understand what is in SkBuffContext when an AF_INET6 socket is used...

@jornfranke
Author

Maybe some more context. This is my ebpf program

[..]

#[socket_filter(name = "sock_egress")]
pub fn sock_egress(ctx: SkBuffContext) -> i64 {
    match try_sock_egress(ctx) {
        Ok(ret) => ret,
        Err(_) => 0,
    }
}
[..]
fn try_sock_egress(ctx: SkBuffContext) -> Result<i64, i64> {
    // determine protocol
    let h_proto = unsafe { (*ctx.skb.skb).protocol };

    // only process IPv4 and IPv6 packets
    let ip_version: u32 = match h_proto {
        ETH_P_IP => 4,
        ETH_P_IPV6 => 6,
        _ => return Ok(0), // drop packet
    };
    // determine destination of the packet
    let destination: u128 = match ip_version {
        4 => u32::from_be(ctx.load(offset_of!(iphdr, saddr)).unwrap()) as u128,
        6 => u128::from_be(ctx.load(offset_of!(ipv6hdr, saddr)).unwrap()),
        _ => 0,
    };
    [..]

As said - the IPv4 part works correctly (correct IP etc.), the IPv6 part - I think I am missing something.

@jornfranke
Author

Ahh I see, if the protocol is IPv6 then I do not get the IP header in the data of skbuff, but only the TCP header... So this part is consistent with the raw socket API for IPv6.
Any idea on how to get somehow the IP address then in the eBPF program?

@jornfranke
Author

jornfranke commented Dec 31, 2022

Or if this is feasible at all? I just want to explore what is possible with a socket filter, I am aware that I can also use XDP or TC.
My use case: a user space program has a raw socket to inspect all IP packets. To increase performance I want to prefilter (not drop!) packets based on basic information, such as the IP address. For instance, the user space program should only look at the packets with "suspicious" IP addresses and the rest should not even reach the user space program...
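
The prefiltering idea can be sketched in plain Rust, as user-space logic only and not as an eBPF program. A socket filter's return value is the number of bytes of the packet to deliver to the socket, and 0 means the packet is not delivered to that socket at all (the rest of the network stack still processes it). The names `verdict`, `DROP_FOR_SOCKET` and `KEEP_WHOLE_PACKET` are hypothetical, and the slice of addresses stands in for what would presumably be a BPF map in a real program:

```rust
use std::net::Ipv6Addr;

/// Returning 0 means: do not deliver this packet to the filtered socket.
const DROP_FOR_SOCKET: u32 = 0;
/// Returning a large length means: deliver the packet without truncation.
const KEEP_WHOLE_PACKET: u32 = u32::MAX;

/// Hypothetical decision function: only packets whose address is on the
/// `suspicious` list are passed on to the user-space program.
fn verdict(addr: Ipv6Addr, suspicious: &[Ipv6Addr]) -> u32 {
    if suspicious.contains(&addr) {
        KEEP_WHOLE_PACKET
    } else {
        DROP_FOR_SOCKET
    }
}
```

A linear scan is only reasonable for a handful of addresses; a hash-based lookup would be the natural choice for longer lists.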

@FallingSnow

Ahh I see, if the protocol is IPv6 then I do not get the IP header in the data of skbuff, but only the TCP header... So this part is consistent with the raw socket API for IPv6.

Oh, I've never used skb before. Had no idea it was any different.

Sorry if I'm not following. Are you saying https://docs.aya-rs.dev/bpf/aya_bpf/bindings/struct.__sk_buff.html has protocol but the ip address fields aren't filled out?

@jornfranke
Author

jornfranke commented Dec 31, 2022

Well the ip address fields are only filled out for BPF type: BPF_PROG_TYPE_SK_SKB (see: https://blogs.oracle.com/linux/post/bpf-a-tour-of-program-types) - not for BPF_PROG_TYPE_SOCKET_FILTER

It seems - the only way to access ancillary data (i.e. the IPv6 header) is to load it from a negative offset:
https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/include/uapi/linux/filter.h#L60

It seems that aya only allows unsigned (positive) offsets. Can anyone confirm this assumption?

@FallingSnow

Hmm, interesting. It seems aya's load only takes a usize offset, so only non-negative offsets are possible.

@alessandrod
Collaborator

Ahh I see, if the protocol is IPv6 then I do not get the IP header in the data of skbuff, but only the TCP header

What makes you think this? Here's an example of a socket filter parsing IP headers https://github.com/torvalds/linux/blob/03421a92f5627430d23ed95df55958e04848f184/samples/bpf/sockex2_kern.c#L100

Well the ip address fields are only filled out for BPF type: BPF_PROG_TYPE_SK_SKB (see: https://blogs.oracle.com/linux/post/bpf-a-tour-of-program-types) - not for BPF_PROG_TYPE_SOCKET_FILTER

I haven't written a socket filter in a long time but from what I can see in the kernel source, this doesn't seem to be true either?

@jornfranke
Author

thanks a lot.

Ahh I see, if the protocol is IPv6 then I do not get the IP header in the data of skbuff, but only the TCP header

What makes you think this? Here's an example of a socket filter parsing IP headers https://github.com/torvalds/linux/blob/03421a92f5627430d23ed95df55958e04848f184/samples/bpf/sockex2_kern.c#L100

Because the data starts with the TCP header - I can parse the port etc. successfully. It could be due to the way the raw socket is opened. I use:

let fd: i32 = unsafe { libc::socket(libc::AF_INET6, libc::SOCK_RAW, libc::IPPROTO_TCP) };

In this way - even without eBPF - I receive only the TCP header in the user space program, which is normal (cf. e.g. https://schoenitzer.de/blog/2018/Linux%20Raw%20Sockets.html).

Will try with

let fd: i32 = unsafe { libc::socket(libc::AF_PACKET, libc::SOCK_DGRAM, libc::ETHERTYPE_IPV6) };

Well the ip address fields are only filled out for BPF type: BPF_PROG_TYPE_SK_SKB (see: https://blogs.oracle.com/linux/post/bpf-a-tour-of-program-types) - not for BPF_PROG_TYPE_SOCKET_FILTER

I haven't written a socket filter in a long time but from what I can see in the kernel source, this doesn't seem to be true either?

Well, I just quoted the blog; this may have changed between kernel versions. However, unlike the BPF helper functions, where it is easy to see which ones are allowed and which are not, finding out which fields of the __sk_buff view the kernel fills in is not so obvious (or I am looking in the wrong place). Where do you see this in the kernel source? Just for clarification - some of the fields in the __sk_buff view are filled, but not all - especially not the IP address ones if I use a socket filter.

According to the tests, it seems the blog is correct: https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/verifier/ctx_skb.c

Nevertheless, I could also be overlooking something here.

Maybe the confusion comes from the different types of raw sockets used (I used an AF_INET6 one, the example you reference seems to be at a lower level, possibly AF_PACKET).
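
To make the difference between the socket types concrete, here is a small Rust sketch of where the IPv6 header would be expected in the received data for the three ways of opening the socket mentioned in this thread. It is based on the behaviour reported above and on the usual description of packet sockets, and it assumes plain Ethernet without VLAN tags. `Capture` and `ipv6_header_offset` are hypothetical names:

```rust
/// How the capturing socket was opened (the three variants discussed here).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Capture {
    /// socket(AF_INET6, SOCK_RAW, IPPROTO_TCP): data starts at the TCP header.
    Inet6RawTcp,
    /// socket(AF_PACKET, SOCK_DGRAM, ..): link-layer header is removed.
    PacketDgram,
    /// socket(AF_PACKET, SOCK_RAW, ..): link-layer header is included.
    PacketRaw,
}

/// Length of an untagged Ethernet header (assumption: no VLAN tag).
const ETH_HDR_LEN: usize = 14;

/// Offset of the IPv6 header in the received data, or None if the
/// IPv6 header is not part of the data at all.
fn ipv6_header_offset(capture: Capture) -> Option<usize> {
    match capture {
        Capture::Inet6RawTcp => None,
        Capture::PacketDgram => Some(0),
        Capture::PacketRaw => Some(ETH_HDR_LEN),
    }
}
```

Under these assumptions, `offset_of!(ipv6hdr, saddr)` would only be a valid load offset as-is for the `PacketDgram` case, and would need `ETH_HDR_LEN` added for `PacketRaw`.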

@jornfranke
Author

jornfranke commented Jan 1, 2023

Somehow, I have issues with Rust and a raw socket with AF_PACKET. For example, this quick and dirty C program works

#include <arpa/inet.h>        /* htons */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <netinet/if_ether.h> /* ETH_P_ALL */
#include <sys/socket.h>
#include <unistd.h>           /* close */

int main()
{
	unsigned char *buffer = (unsigned char *) malloc(65536); //Its Big!

	int data_size;
	int sock_raw = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

	if (sock_raw < 0)
	{
		//Print the error with proper message
		perror("Socket Error");
		return 1;
	}
	while (1)
	{
		printf("Receiving\n");
		//Receive a packet
		data_size = recv(sock_raw, buffer, 65536, 0);
		if (data_size < 0)
		{
			printf("Recv error , failed to get packets\n");
			return 1;
		}
		printf("%d\n", data_size);
	}
	//Not reached: the loop above only exits via return
	close(sock_raw);
	printf("Finished");
	return 0;
}

It shows that packets are received over the raw socket with AF_PACKET.

However, the equivalent quick and dirty Rust program does not show anything. It just prints "Enter loop" and then blocks in recv.

fn main() {
    // create raw socket
    let fd: i32 = unsafe { libc::socket(libc::AF_PACKET, libc::SOCK_RAW, libc::ETH_P_ALL) };
    if fd < 0 {
        println!("Error socket");
        return;
    }

    let mut buffer = vec![0u8; 4096].into_boxed_slice();
    loop {
        println!("Enter loop");

        let result = unsafe { libc::recv(fd, buffer.as_mut_ptr() as *mut libc::c_void, buffer.len(), 0) };
        if result < 0 {
            println!("Error read");
        } else {
            println!("Size: {}", result);
        }
    }
}

It works, though, with AF_INET and AF_INET6.

@jornfranke
Author

Ok, found the issue in the Rust program (I had to simulate htons(ETH_P_ALL)):

fn main() {
    // create raw socket; the protocol must be in network byte order,
    // i.e. the equivalent of htons(ETH_P_ALL) in C
    let fd: i32 = unsafe { libc::socket(libc::AF_PACKET, libc::SOCK_RAW, (libc::ETH_P_ALL as u16).to_be() as i32) };
    if fd < 0 {
        println!("Error socket");
        return;
    }

    let mut buffer = vec![0u8; 4096].into_boxed_slice();
    loop {
        println!("Enter loop");

        let result = unsafe { libc::recv(fd, buffer.as_mut_ptr() as *mut libc::c_void, buffer.len(), 0) };
        if result < 0 {
            println!("Error read");
        } else {
            println!("Size: {}", result);
        }
    }
}
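
The byte-order conversion can be isolated into a small helper, shown here as a standalone sketch that needs no `libc`. `htons` and `packet_protocol` are hypothetical helper names, and the value of `ETH_P_ALL` is written out as 0x0003, as defined in the Linux headers:

```rust
/// EtherType wildcard "all protocols" (value from linux/if_ether.h).
const ETH_P_ALL: u16 = 0x0003;

/// Host-to-network conversion for a 16-bit value, like C's htons().
fn htons(v: u16) -> u16 {
    v.to_be()
}

/// Protocol argument for socket(AF_PACKET, ..): the EtherType in network
/// byte order, widened to the signed 32-bit int that socket() expects.
fn packet_protocol(ethertype: u16) -> i32 {
    i32::from(htons(ethertype))
}
```

On a little-endian machine `packet_protocol(ETH_P_ALL)` yields 0x0300 rather than 0x0003. Passing the unconverted value therefore asks for a different protocol number on such machines, which would be consistent with `recv` blocking in the first Rust version.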

I will continue with the raw packets for now. I just wonder whether the following makes sense in aya (or whether I misunderstood how it works)?

It seems - the only way to access ancillary data (i.e. the IPv6 header) is to load it from a negative offset:
https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/include/uapi/linux/filter.h#L60

If it makes sense then I can close this issue and create a new issue for aya on this, if not then I simply close this issue.

Please let me know.

@alessandrod
Collaborator

alessandrod commented Jan 1, 2023

It seems - the only way to access ancillary data (i.e. the IPv6 header) is to load it from a negative offset:
https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/include/uapi/linux/filter.h#L60

If it makes sense then I can close this issue and create a new issue for aya on this, if not then I simply close this issue.

https://github.com/torvalds/linux/blob/150aae354b817f540848476bace2b2ba9931b197/net/core/filter.c#L340

It seems to me that that stuff is only for (classic) BPF and is mapped to just accessing skb->$field in eBPF?
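
For context, the negative offsets in question are the classic BPF pseudo-offsets defined in `linux/filter.h`: `SKF_AD_OFF` (-0x1000), `SKF_NET_OFF` (-0x100000) and `SKF_LL_OFF` (-0x200000). The following Rust sketch is a simplified model of how such an offset is interpreted, not kernel code. It omits all bounds checks, and `Region` and `classify` are hypothetical names:

```rust
/// Classic BPF pseudo-offsets (values from include/uapi/linux/filter.h).
const SKF_AD_OFF: i32 = -0x1000;
const SKF_NET_OFF: i32 = -0x10_0000;
const SKF_LL_OFF: i32 = -0x20_0000;

/// What a classic BPF load offset refers to, with the relative offset.
#[derive(Debug, PartialEq)]
enum Region {
    /// Ordinary offset into the packet data.
    PacketData(u32),
    /// Ancillary field (protocol, ifindex, ...), not packet bytes.
    Ancillary(u32),
    /// Offset relative to the network (IP) header.
    NetworkHeader(u32),
    /// Offset relative to the link-layer (MAC) header.
    LinkLayerHeader(u32),
    /// Below all known bases.
    Invalid,
}

/// Simplified model of the offset ranges; real loads are bounds-checked.
fn classify(k: i32) -> Region {
    if k >= 0 {
        Region::PacketData(k as u32)
    } else if k >= SKF_AD_OFF {
        Region::Ancillary((k - SKF_AD_OFF) as u32)
    } else if k >= SKF_NET_OFF {
        Region::NetworkHeader((k - SKF_NET_OFF) as u32)
    } else if k >= SKF_LL_OFF {
        Region::LinkLayerHeader((k - SKF_LL_OFF) as u32)
    } else {
        Region::Invalid
    }
}
```

In this model, reading the IPv6 source address in classic BPF would correspond to `classify(SKF_NET_OFF + 8)`, i.e. a network-header-relative load rather than an ancillary field.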

@jornfranke
Author

Good question, I will investigate. I can confirm, though, that with an aya socket filter I did not have access to remote_ip etc., only to protocol (essentially the fields that are allowed by the tests shown in the Linux kernel). I do not think this was related to aya.
