fn get_links(link_nodes: Select) -> Option<String> {

        let mut rel_permalink: Option<String> = for node in link_nodes {
            link = String::from(node.value().attr("href")?);

            return Some(link);
        };

        Some(rel_permalink)
    }

This is what I’m trying to do, and I’ve been stuck on this code for an hour. I simply don’t know how to put this function together… Essentially I would like to take some link_nodes and just return the link String, but I’m stuck on the use of Option with the ? operator… Perhaps writing it with match would clear things up(?)

Also, I come from JavaScript, in which expressions do not have their own scope, so I’m having trouble understanding how to get a variable out of a for loop. Should I initialize the rel_permalink variable as the for loop result?

These are the errors I get:

error[E0308]: mismatched types
  --> src/main.rs:55:49
   |
55 |           let mut rel_permalink: Option<String> = for node in link_nodes {
   |  _________________________________________________^
56 | |             link = String::from(node.value().attr("href")?);
57 | |
58 | |             return Some(link);
59 | |         };
   | |_________^ expected `Option<String>`, found `()`
   |
   = note:   expected enum `Option<String>`
           found unit type `()`
note: the function expects a value to always be returned, but loops might run zero times
  --> src/main.rs:55:49
   |
55 |         let mut rel_permalink: Option<String> = for node in link_nodes {
   |                                                 ^^^^^^^^^^^^^^^^^^^^^^ this might have zero elements to iterate on
56 |             link = String::from(node.value().attr("href")?);
   |                                                          - if the loop doesn't execute, this value would never get returned
57 |
58 |             return Some(link);
   |             ----------------- if the loop doesn't execute, this value would never get returned
   = help: return a value for the case when the loop has zero elements to iterate on, or consider changing the return type to account for that possibility
  • taladar@sh.itjust.works
    13 days ago

    Your return will return from the function, not from the for loop as you probably assume. The for loop itself does not produce a value: only loop-based loops can use break with a value; for and while loops cannot.

    You also forgot the let keyword in your assignment.
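
To make the difference between the loop kinds concrete, here is a minimal sketch using only the standard library. The function names and the slice of optional strings are hypothetical stand-ins for the scraped nodes; nothing here comes from the scraper crate.

```rust
/// `loop` is the only loop that can yield a value, via `break value`.
fn first_some_with_loop(items: &[Option<&str>]) -> Option<String> {
    let mut iter = items.iter();
    // The value passed to `break` becomes the value of the whole `loop` expression.
    let found: Option<String> = loop {
        match iter.next() {
            Some(Some(s)) => break Some(s.to_string()),
            Some(None) => continue, // this item has no value, keep looking
            None => break None,     // ran out of items
        }
    };
    found
}

/// A `for` loop always evaluates to `()`, so the result has to leave it another way:
/// `return` exits the whole function, and the trailing `None` covers the empty case.
fn first_some_with_for(items: &[Option<&str>]) -> Option<String> {
    for item in items {
        if let Some(s) = item {
            return Some(s.to_string());
        }
    }
    None
}
```

Both functions give the same answer for the same input; the second shape is the one the compiler's help message is nudging towards, since it returns a value even when the loop runs zero times.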

    I assume you want to return the value of the href attribute for the first node that has one? In that case you want something like

    fn get_first_href_value(link_nodes: Select) -> Option<String> {
        for node in link_nodes {
            if let Some(href_value) = node.value().attr("href") {
                return Some(href_value.into());
            }
        }

        None
    }
    

    or, more idiomatically

    fn get_first_href_value(link_nodes: Select) -> Option<String> {
        link_nodes.into_iter().find_map(|node| node.value().attr("href")).map(|v| v.to_string())
    }
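
The find_map shape can be tried without the scraper crate. The sketch below uses a hypothetical `Node` type with an `attr` method that only mimics the shape of an attribute lookup; it is an assumption made for illustration, not the real scraper API.

```rust
/// Hypothetical stand-in for a scraped element: just an optional `href`.
struct Node {
    href: Option<String>,
}

impl Node {
    /// Mimics the shape of an attribute lookup: `Some(&str)` if present, else `None`.
    fn attr(&self, name: &str) -> Option<&str> {
        if name == "href" {
            self.href.as_deref()
        } else {
            None
        }
    }
}

/// Returns the first `href` found, or `None` if no node has one.
fn first_href(nodes: &[Node]) -> Option<String> {
    nodes
        .iter()
        .find_map(|node| node.attr("href")) // stops at the first `Some`
        .map(str::to_string) // &str -> String
}
```

Because find_map stops at the first hit, nodes after the first one with an href are never inspected.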
    
  • asudox@programming.dev
    13 days ago

    Please explain what it is you are trying to do.

    As far as I can understand from the function return value and the body of the function, you are trying to compose all of the href attributes in each node into a single String, which is encapsulated in an Option. Is this correct? Something looks weird but I can’t quite catch it. I hope you didn’t take a quick glance over the rust book and just started assuming things work like JS in Rust.

  • RustyNova@lemmy.world
    13 days ago

    I wouldn’t mind having an explanation of what you want to do instead of the code. It’s not quite clear what you mean.

    Anyways, what you want is to transform an iterator (your Select) into an iterator of Option<String>?

    For that, there are multiple ways, but here’s the simplest:

    link_nodes.map(|node| node.value().attr("href").map(|href| href.to_string()))

    Essentially, for each element, we execute a closure (an arrow function in JavaScript) that transforms the node into an Option holding your href string, or None if the node has no href.

    P.S. can’t guarantee it works, I don’t know what this “Select” type is, and I’m programming on mobile
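
A std-only sketch of the same idea, assuming each input item stands for the result of one attribute lookup (the function name and the input shape are hypothetical):

```rust
/// `map` keeps one output per input, so missing values stay as `None`.
fn hrefs_as_options(lookups: &[Option<&str>]) -> Vec<Option<String>> {
    lookups
        .iter()
        // The inner `map` is `Option::map`: it converts the `&str` only when one is present.
        .map(|lookup| lookup.map(|href| href.to_string()))
        .collect()
}
```

Note that the output has the same length as the input; dropping the None entries needs filter_map or flatten instead of map.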

  • Badland9085@lemm.ee
    9 days ago

    The return statement always returns from the function, regardless of scope within the function. So what you want is just Some(link) without the semicolon.

    That said, I’m not quite sure if you’re using the for loop correctly here. Are you trying to get just one value out of link_nodes?

  • TehPers@beehaw.org
    13 days ago

    It’s hard to tell what it is you’re trying to do here, but maybe Option isn’t the right type? To me it feels like you’d want to return a type like Vec or an iterator.

    I would recommend looking at some of the iterator functions to do this. You could look at filter_map, collect, and fold/try_fold and see if any of those help you here.
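
As a rough sketch of how those iterator functions differ, here are three hypothetical helpers over a slice of optional strings (a stand-in for the attribute lookups, not the scraper types):

```rust
/// Keep only the values that are present: `filter_map` drops `None`, `collect` builds the `Vec`.
fn present_hrefs(lookups: &[Option<&str>]) -> Vec<String> {
    lookups
        .iter()
        .filter_map(|lookup| lookup.map(str::to_string))
        .collect()
}

/// The same result built by hand with `fold`, threading the accumulator through each step.
fn present_hrefs_fold(lookups: &[Option<&str>]) -> Vec<String> {
    lookups.iter().fold(Vec::new(), |mut acc, lookup| {
        if let Some(href) = lookup {
            acc.push(href.to_string());
        }
        acc
    })
}

/// `try_fold` short-circuits: the whole result is `None` as soon as one value is missing.
fn all_hrefs_or_none(lookups: &[Option<&str>]) -> Option<Vec<String>> {
    lookups
        .iter()
        .try_fold(Vec::new(), |mut acc, lookup| -> Option<Vec<String>> {
            acc.push((*lookup)?.to_string());
            Some(acc)
        })
}
```

Which one fits depends on whether a missing href should be skipped (the first two) or should invalidate the whole result (the third).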

  • gigapixel@mastodontti.fi
    13 days ago

    @dontblink It’s a little unclear what you want to do. It looks like Select implements IntoIterator. As far as I can parse your code, you want to get the first node with an href attribute. In that case you should do:

    link_nodes.into_iter().filter_map(|node| node.value().attr("href")).map(String::from).next()
    
    • gigapixel@mastodontti.fi
      13 days ago

      @dontblink Here’s a little explanation for the methods
      - into_iter: converts Select into an iterator (a for loop does this implicitly)
      - filter_map: takes an iterator and an Fn(T) -> Option<S>, and constructs an iterator that emits only the values for which the function/closure returns Some
      - next: takes the next element of the iterator as an Option.
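
The three methods can be seen working together on plain strings. This is only a sketch with a hypothetical function name, and it swaps the href lookup for integer parsing so that it needs nothing beyond the standard library:

```rust
/// Returns the first input that parses as an integer, if any.
fn first_number(inputs: Vec<&str>) -> Option<i32> {
    inputs
        .into_iter() // consume the Vec, yielding each &str
        .filter_map(|text| text.parse::<i32>().ok()) // keep only the inputs that parse
        .next() // take the first survivor, if there is one
}
```

The standard library documents find_map as equivalent to filter_map followed by next, so the last two steps could be collapsed into a single call.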

        • gigapixel@mastodontti.fi
          13 days ago

          @dontblink If you really want to do it in a loop, you could do something like

          fn get_links(link_nodes: Select) -> Option<String> {
              let mut rel_permalink: Option<String> = None;
              for node in link_nodes {
                  // No `?` here: `if let` already handles the None case.
                  if let Some(link) = node.value().attr("href") {
                      rel_permalink = Some(String::from(link));
                      break;
                  }
              }
              rel_permalink
          }
          

          That said, your function name suggests you want _all_ links, so some kind of collection of links. Is this the case?

          • dontblink (OP)
            13 days ago

            Hi! First of all thank you so much for the detailed explanation!

            What I’m trying to do is scrape some content.

            Yes, I’m trying to return all the links (maybe in a vector). I have a list of elements (Select, which is actually scraper::html::Select<'_, '_>) that essentially contains selected HTML nodes. I would like to go through each of them, extract the actual link value (&str), convert it into a String, and push it first into a vector containing all the links and then into an instance of a struct that will later hold several pieces of data about the scraped page.

            I was trying to use a for loop because that was the first structure that came to my mind. I’m finding it hard to wrap my head around ownership and error handling in Rust. Using the if let construct could be a good idea, and I hadn’t considered using break!

            I also managed to build the “match version” of what I was trying to achieve:

            fn get_links(link_nodes: scraper::html::Select<'_, '_>) -> Vec<String> {
                let mut links = vec![];

                for node in link_nodes {
                    match node.value().attr("href") {
                        Some(link) => {
                            links.push(link.to_string());
                        }
                        None => (),
                    }
                }

                dbg!(&links);
                links
            }
            

            I didn’t understand that every arm of a match on an Option has to evaluate to the same type. I thought that, since enum variants can hold different types, the arms could too. So if the Some arm evaluates to (), the None arm has to do the same…

            If I try with a simpler example I still cannot understand why I cannot do something like:

            enum OperativeSystem {
                        Linux,
                        Windows,
                        Mac,
                        Unrecognised,
                    }
            
                    let absent_os = OperativeSystem::Unrecognised;
                    find_os(absent_os);
            
                    fn find_os(os: OperativeSystem) -> String {
                        match os {
                            debian => {
                                let answer = "This pc uses Linux";
                                answer.to_string()
                            }
                            windows10home => {
                                let answer = "This pc uses Windows, unlucky you!";
                                answer.to_string()
                            }
                            ios15 => {
                                let answer = "This pc uses Mac, I'm really sorry!";
                                answer.to_string()
                            }
                            _ => {
                                let is_unrecognised = true;
                                is_unrecognised
                            }
                        }
                    }
            

            match is much more intuitive for a beginner; there’s a lot of stuff that goes on under the hood with ?
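
On the find_os example above, two separate things get in the way. A bare lowercase name such as debian in a match arm is not a variant: it is a new variable that matches anything, so the first arm always wins. And a match is a single expression, so every arm has to produce the function's return type; an arm that yields a bool cannot sit next to arms that yield a String. The sketch below is one possible corrected version, with the message texts and the second function name chosen only for illustration:

```rust
enum OperativeSystem {
    Linux,
    Windows,
    Mac,
    Unrecognised,
}

/// Every arm evaluates to a `String`, which matches the declared return type.
fn find_os(os: OperativeSystem) -> String {
    match os {
        // Variants are matched by their full path.
        OperativeSystem::Linux => "This pc uses Linux".to_string(),
        OperativeSystem::Windows => "This pc uses Windows".to_string(),
        OperativeSystem::Mac => "This pc uses Mac".to_string(),
        OperativeSystem::Unrecognised => "Unrecognised operating system".to_string(),
    }
}

/// If "not found" should be a distinct outcome, encode it in the return type instead.
fn find_os_name(os: OperativeSystem) -> Option<&'static str> {
    match os {
        OperativeSystem::Linux => Some("Linux"),
        OperativeSystem::Windows => Some("Windows"),
        OperativeSystem::Mac => Some("Mac"),
        OperativeSystem::Unrecognised => None,
    }
}
```

Enum variants can carry different data, but that is a property of the value being matched, not of the result: the result of the whole match still has exactly one type.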

            • asudox@programming.dev
              12 days ago

              Here’s what you are trying to do, with a one liner:

              fn get_links(mut link_nodes: Select) -> Vec<String> {
                  link_nodes.retain(|node| node.value().attr("href").is_some()).into_iter().fold(Vec::new(), |links, node| links.push(link.value().attr("href").unwrap().to_string()))
              }
              

              edit: shorter and updated version:

              fn get_links(link_nodes: Select) -> Vec<String> {
                  link_nodes.into_iter().filter_map(|node| node.value().attr("href").map(|href| href.to_string())).collect()
              }
              

              In the first version, the retain method is meant to get rid of all the nodes which don’t have an href attribute, and the fold method after it is meant to extract the href out of the remaining nodes and push them into the vector.

              It might work or not, I’ve written this from my memory and I can’t exactly know what that Select is.

              I also hope you begin reading The Book without half assing it.

              • taladar@sh.itjust.works
                12 days ago

                You should use filter_map instead of the retain and later unwrapping, and you don’t need a fold to build a Vec from an iterator; you can just use collect for that at the end.

                • asudox@programming.dev
                  12 days ago

                  here, it definitely is shorter, I’ll keep filter_map in mind, thanks:

                  fn get_links(link_nodes: Select) -> Vec<String> {
                      link_nodes.into_iter().filter_map(|node| node.value().attr("href").map(|href| href.to_string())).collect()
                  }
                  
            • gigapixel@mastodontti.fi
              13 days ago

              @dontblink Yeah, I think it makes sense to use match as a beginner just to understand what is going on. In the end the x? just desugars to

              match x {
                  Some(value) => value,
                  None => return None,
              }
              

              for options and

              match x {
                  Ok(value) => value,
                  Err(err) => return Err(err.into()),
              }


              for results, where that .into() converts the error into the error type of the function’s return type.
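
A small runnable sketch of that equivalence, with hypothetical function names and only the standard library:

```rust
use std::num::ParseIntError;

/// With `?`: an absent first character returns `None` from the function early.
fn first_char_upper(text: &str) -> Option<char> {
    let first = text.chars().next()?;
    Some(first.to_ascii_uppercase())
}

/// The same function with the `?` written out as a `match`.
fn first_char_upper_match(text: &str) -> Option<char> {
    let first = match text.chars().next() {
        Some(value) => value,
        None => return None,
    };
    Some(first.to_ascii_uppercase())
}

/// `?` on a `Result`: a parse failure is returned to the caller as `Err`.
fn double(text: &str) -> Result<i32, ParseIntError> {
    let n: i32 = text.parse()?;
    Ok(n * 2)
}
```

In all three functions the early exit leaves the whole function, which is also why ? inside a for loop body cannot be used to skip just one iteration.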

              • gigapixel@mastodontti.fi
                13 days ago

                @dontblink For stuff like this, I tend to prefer Rust’s iterator combinators, but I admit it’s a bit more advanced. So in this case you’d do

                fn get_links(link_nodes: scraper::html::Select<'_, '_>) -> Vec<String> {
                    link_nodes
                        .into_iter() // convert select to iterator
                        .filter_map(|node| node.value().attr("href")) // only select nodes with an href attribute, and select that value
                        .map(ToString::to_string) // map that value to a string
                        .collect()
                }
                
                
                • gigapixel@mastodontti.fi
                  13 days ago

                  @dontblink The idea of the iterator is to assemble all of your computation lazily and only perform it at the final step. In this case you don’t actually perform any of the actions until you call collect, which essentially runs a loop that goes through the original Select, performs the calculations, and pushes the results to a Vec.
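
The laziness can be observed directly by counting how often the closure runs. This is a minimal sketch with a hypothetical function name, using a Cell as the counter:

```rust
use std::cell::Cell;

/// Returns (closure calls before `collect`, closure calls after `collect`, collected values).
fn lazy_demo() -> (usize, usize, Vec<i32>) {
    let calls = Cell::new(0usize);

    // Building the chain only describes the work; the closure has not run yet.
    let doubled = vec![1, 2, 3].into_iter().map(|n| {
        calls.set(calls.get() + 1);
        n * 2
    });
    let calls_before = calls.get();

    // `collect` drives the iterator: the closure now runs once per element.
    let values: Vec<i32> = doubled.collect();
    let calls_after = calls.get();

    (calls_before, calls_after, values)
}
```

The counter stays at zero until collect is called, and only then reaches the number of elements.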